About This Manual



The information contained in this manual is intended for Digital Customer Services personnel 
responsible for RA90/RA92 disk drive maintenance and service calls. 

This manual contains checkout, servicing, and troubleshooting information for RA90 and RA92 disk 
drives. Procedures for unpacking, deskidding, and cabling 60-inch cabinets are also included. 

Procedures for installing RA90 and RA92 add-on disk drives in 60-inch cabinets are not included in 
this manual. Refer to product-specific documentation.
Introduction



1.1 RA90 and RA92 Disk Drive Descriptions

The RA90 and RA92 disk drives are high-density, fixed-media disk drives that use nonremovable,
thin film media and thin film heads. The RA90/RA92 heads, disks, rotary actuator, and filtering 
system are encased in a single unit called the Head Disk Assembly (HDA). 

The RA90 disk drive has a formatted data storage capacity of 1.216 gigabytes and an unformatted
data storage capacity of 1.604 gigabytes in a 16-bit word format. The RA92 disk drive has a
formatted data storage capacity of 1.506 gigabytes and an unformatted data storage capacity of 
1.987 gigabytes in a 16-bit word format. 

Thirteen surfaces contain data and embedded servo information. The embedded servo information 
is within the intersector gaps. The embedded servo information accomplishes fine positioning of 
read/write heads over the data tracks. Figure 1-1 is an example of the sector format used for
RA90/RA92 disk drives. 

The fourteenth surface is a dedicated servo surface that, when decoded by the drive electronics, 
provides information on: 

• Coarse radial position 

• Track crossing (velocity) 

• Rotational index and sector position 

• Generation of clock synch pulse 

• Inner and outer guardband detection 
Figure 1-1 Example of Sector Format

1.1.1 Physical and Logical Media Layout

The physical structure of the media is transparent to the user. Figures 1-2 and 1-3 represent the
layout of logical information for the RA90 and RA92 media. 
Figure 1-2 RA90 Physical and Logical Media Layout

1.2 Maintenance Strategy 

The RA90 and RA92 disk drives introduce a new approach to repairing peripheral equipment. In 
most cases, RA90/RA92 disk drives afford easy access to field replaceable units (FRU) without the 
use of tools. 

Additional drive maintenance features include the following: 

• A microprocessor-controlled operator control panel (OCP) interface eliminating the need for 
external test equipment 

• EEPROM where an internal error log is stored 

• Twelve error recovery levels 

• Extensive drive-resident diagnostics 

• Drive microcode that can be updated by way of the microcode update port 
Figure 1-3 RA92 Physical and Logical Media Layout

1.2.1 Service Delivery Strategy 

Real-time subsystem (drive) faults detected by the drive are recorded in the RA90/RA92 drive 
internal error log. Real-time faults detected in the disk subsystem are recorded in the supporting 
system host error log. Controller-detected errors (such as ECC errors) are also logged to the host 
error log and not the RA90/RA92 drive-resident error log. 

Use utility programs to obtain a print-out of the drive internal error log and isolate faults, provided 
the error was drive-detected. Additionally, you can run the RA90/RA92 drive-resident utility T41 
to access the drive internal error log. This provides the drive LED error codes only. Use of other 
utility programs provides additional error information. 

Use drive-resident diagnostics to validate repairs to RA90/RA92 disk drives. For more information 
on drive-resident diagnostics and utilities, refer to Chapter 4. 

1.2.1.1 Six-Step Maintenance Strategy 

This section describes the maintenance strategy for RA90 and RA92 disk drives. Become familiar 
with it as it determines the course of action necessary to successfully service RA90/RA92 disk 
drives. 

Implement the following six-step maintenance strategy on each service call for a drive problem: 

1. Examine and analyze VAXsimPLUS. 

2. Examine and analyze system error logs. 

3. Examine and analyze the drive internal error log. 

4. Correlate failure symptoms to the probable failing FRU through service documentation. 

5. Replace the FRU only after a prime FRU is identified from previous steps.

6. Verify device repair through drive-resident diagnostics. (Running host-level diagnostics to verify 
repairs is unnecessary and penalizes the customer by tying up the system.) 

Use host-based diagnostics only as a last resort, to obtain symptomatic failure information, and 
only if system and drive error logs are unavailable. 

Verify the drive is on line and operational through normal system-level commands that access the 
unit under repair. 

1.2.2 Tools Required for Maintenance 

Tools required for maintaining RA90/RA92 disk drives are identified in the procedures where they 
are needed and in Chapter 6. 

1.2.3 Preventative Maintenance 

Customer responsibilities for preventative maintenance (60-inch cabinets only) are described in 
Appendix C. 

Digital Customer Services responsibilities for cabinet and RA90/RA92 disk drive maintenance are 
described in Appendix D. 

1.3 RA90/RA92 Disk Drive Specifications

Table 1-1 lists important operating and nonoperating specifications for RA90 and RA92 disk 
drives. 



Table 1-1 Specifications for RA90 and RA92 Disk Drives

Characteristic                   RA90 Disk Drive             RA92 Disk Drive

Head Disk Assembly (HDA)
  Storage capacity, formatted    1.216 gigabytes             1.506 gigabytes
  Storage capacity, unformatted  1.604 gigabytes             1.987 gigabytes
  HDA word format                16-bit only                 Same as RA90
  Bits/square inch               40 megabits                 49.4 megabits
  Tracks/inch                    1750                        2045
  Disk recording method          Rate 2/3 modulation code    Same as RA90
  Number of disks                7                           Same as RA90
  Disk surfaces                  14 (13 data and 1 servo)    Same as RA90
  Number of heads                14                          Same as RA90
  Heads per surface              1                           Same as RA90
  Data tracks                    34,437                      40,287
  Logical cylinders              2656                        3101
  User logical cylinders         2649                        3099
  Number of sectors              69 + 1 spare                73 + 1 spare
  Number of logical blocks       2,376,153                   2,942,849

Table 1-1 (Cont.) Specifications for RA90 and RA92 Disk Drives

Characteristic                        RA90 Disk Drive              RA92 Disk Drive

Seek Times
  One cylinder                        5.5 milliseconds             3.0 milliseconds
  Average seek                        18.5 milliseconds            16.0 milliseconds
  Maximum cylinder seek               31.5 milliseconds            29.0 milliseconds

Latency
  Rotation speed                      3600 r/min                   3405 r/min
  Average latency                     8.33 milliseconds            8.81 milliseconds
  Maximum latency                     16.67 milliseconds           17.62 milliseconds

Single Start/Stop Time
  Start (maximum)                     40 seconds                   Same as RA90
  Inhibit between stop and restart    40 seconds                   Same as RA90

Data Rates
  Transfer rate                       2.77 megabytes/sec           Same as RA90

Physical Characteristics
  Height                              26.56 cm (10.42 inches)      Same as RA90
  Width                               22.19 cm (8.74 inches)       Same as RA90
  Depth                               68.47 cm (26.96 inches)      Same as RA90
  Weight                              31.8 kg (70 pounds)          Same as RA90

Inrush Current
  120 Vac                             60 amperes peak @ 132 Vac    Same as RA90
  220-240 Vac                         70 amperes peak @ 264 Vac    Same as RA90

Running current for:
  120 Vac                             4.6 amps                     Same as RA90
  220-240 Vac                         2.4 amps                     Same as RA90

Power factor:
  120 Vac                             0.7                          Same as RA90
  220-240 Vac                         0.58                         Same as RA90

Line cord length (from the cabinet)   2.74 meters (9 feet)         Same as RA90
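
Several Table 1-1 figures follow from one another, so they can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that latency is derived from rotation speed alone and that track and block counts are plain products of cylinders, data surfaces, and sectors per track (spare sectors excluded).

```python
# Arithmetic cross-check of Table 1-1; all inputs are copied from the table.
# Function names are hypothetical and exist only for this illustration.

DATA_SURFACES = 13  # 13 data surfaces; the 14th surface is dedicated servo


def average_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return 60_000.0 / rpm / 2.0


def maximum_latency_ms(rpm):
    """Maximum rotational latency: one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def data_tracks(cylinders, surfaces=DATA_SURFACES):
    """Data tracks = cylinders x data surfaces (one head per surface)."""
    return cylinders * surfaces


def logical_blocks(cylinders, sectors_per_track, surfaces=DATA_SURFACES):
    """Logical blocks = cylinders x data surfaces x sectors per track."""
    return cylinders * surfaces * sectors_per_track


if __name__ == "__main__":
    for name, rpm in (("RA90", 3600), ("RA92", 3405)):
        print(name, round(average_latency_ms(rpm), 2), round(maximum_latency_ms(rpm), 2))
    print("RA90 data tracks:", data_tracks(2649))
    print("RA92 data tracks:", data_tracks(3099))
    print("RA90 logical blocks:", logical_blocks(2649, 69))
```

With the table inputs, the latency and data track figures for both drives and the RA90 logical block count are reproduced from the user logical cylinder counts. The RA92 logical block figure as printed (2,942,849) equals 3101 x 13 x 73, that is, it corresponds to the logical cylinder count rather than the user logical cylinder count; this is an arithmetic observation only.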

Table 1-2 contains additional electrical specifications by model for RA90 and RA92 disk drives.

NOTE 

The RA90 and RA92 disk drives are not line-frequency dependent. 

Table 1-2 Additional Electrical Specifications by Model for RA90 and RA92 Disk Drives

                                 Input Current (Amps)¹
                    Nominal      Start-Up                     Power          BTUs/Hour
Model               Voltage      Current    PH1    Neutral    Dissipation    [kJ/Hour]²

RA90-3s/RA92-xs:    120 volts    5.0        3.4    3.4        281 Watts
RA90-xx/RA92-xjc    240 volts    2.35       1.45   1.45       271 Watts      [976]

¹ Currents are for nominal voltages of 120 Vac phase to neutral or for 240 Vac phase to neutral. For 101 Vac and 220
Vac nominal voltages, the drives will have proportionately higher phase currents by a ratio of 120/101 or 240/220 to the
currents specified in this table.

² Bracketed figures indicate kilojoules per hour.
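
The heat and current figures in Table 1-2 are related by simple conversions. The sketch below is illustrative: the function names are hypothetical, the BTU factor is the standard conversion (1 watt is approximately 3.412 BTU/hour) rather than a value taken from this manual, and the current scaling simply applies the ratio described in footnote 1.

```python
# Illustrative conversions for Table 1-2; names are hypothetical.

KJ_PER_HOUR_PER_WATT = 3.6        # 1 W = 3600 J/h = 3.6 kJ/h (exact)
BTU_PER_HOUR_PER_WATT = 3.412142  # standard conversion, not from the manual


def watts_to_kj_per_hour(watts):
    """Heat dissipation in kilojoules per hour."""
    return watts * KJ_PER_HOUR_PER_WATT


def watts_to_btu_per_hour(watts):
    """Heat dissipation in BTUs per hour."""
    return watts * BTU_PER_HOUR_PER_WATT


def scaled_phase_current(table_amps, table_volts, actual_volts):
    """Footnote 1: current rises in proportion to table_volts / actual_volts."""
    return table_amps * table_volts / actual_volts


if __name__ == "__main__":
    print(round(watts_to_kj_per_hour(271)))            # bracketed kJ/hour figure
    print(round(watts_to_btu_per_hour(271)))
    print(round(scaled_phase_current(3.4, 120, 101), 2))
    print(round(scaled_phase_current(1.45, 240, 220), 2))
```

For the 240-volt model, 271 watts converts to about 976 kilojoules per hour, which agrees with the bracketed figure in the table.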

Table 1-3 shows the maximum environmental limits and the recommended environmental
operating ranges to optimize equipment performance and reliability. 

Table 1-3 RA90/RA92 Environmental Limits 



Characteristic            RA90/RA92 Disk Drive

Maximum Environmental Limits

Temperature (Required)
  Operating               10°C to 40°C (50°F to 104°F) with a temperature gradient of
                          20°C/hour (36°F/hour)
  Nonoperating            -40°C to +60°C (-40°F to +140°F)

Relative humidity
  Operating               10% to 90% (noncondensing) with a minimum wet bulb
                          temperature of 28°C (82°F) and a minimum dew point of 2°C
                          (36°F)
  Nonoperating            10% to 90% with no condensation

Table 1-3 (Cont.) RA90/RA92 Environmental Limits

Characteristic            RA90/RA92 Disk Drive

Recommended Environmental Operating Ranges

Temperature               18°C to 24°C (64.4°F to 75.2°F) with an average rate of change
                          of 3°C/hour maximum and a step change of 3°C or less

Relative humidity         40% to 60% (noncondensing) with a step change of 10% or less
                          (noncondensing)

Air quality (maximum      Not to exceed 500,000 particles per cubic foot of air at a size of
particle count)           0.5 micron or larger

Air volume (at inlet)     50 cubic feet per minute (.026 cubic meters per second)

Altitude
  Operating               Sea level to 2400 meters (8000 feet); maximum allowable
                          operating temperatures are reduced by a factor of 1.8°C/1000
                          meters (1°F/1000 feet) for operation above sea level
  Nonoperating            300 meters (1000 feet) below sea level to 7500 meters (16,000
                          feet) above sea level (actual or effective by means of cabin
                          pressurization)
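
The altitude derating rule in Table 1-3 can be worked through numerically. The following is a minimal sketch, assuming the derating applies linearly to the 40°C required operating limit; the function name and default arguments are hypothetical.

```python
# Illustrative altitude derating; names and defaults are assumptions.

SEA_LEVEL_MAX_C = 40.0       # required operating maximum from Table 1-3
DERATE_C_PER_1000_M = 1.8    # reduction per 1000 meters above sea level
MAX_OPERATING_ALTITUDE_M = 2400.0


def max_operating_temp_c(altitude_m):
    """Maximum allowable operating temperature at a given altitude (meters)."""
    if not 0.0 <= altitude_m <= MAX_OPERATING_ALTITUDE_M:
        raise ValueError("altitude outside the operating range of Table 1-3")
    return SEA_LEVEL_MAX_C - DERATE_C_PER_1000_M * altitude_m / 1000.0


if __name__ == "__main__":
    for altitude in (0, 1000, 2400):
        print(altitude, "m:", round(max_operating_temp_c(altitude), 2), "C")
```

Under these assumptions, a site at 1000 meters would have a maximum allowable operating temperature of 38.2°C, and a site at 2400 meters about 35.7°C.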



1.4 Electrostatic Protection

Electrostatic discharge (ESD) is the result of electrostatic buildup and its subsequent release. The 
surface storage of an electrostatic charge from a person or object can damage hardware components 
and may result in premature device or option failure. 

The basic concept of static protection for electronic components is the prevention of static buildup, 
where possible, and the safe release of existing electrostatic charge buildup. If the charged object is
a conductor, such as a person or a conductive object, it can be completely discharged by
grounding it.

Use the following guidelines when handling static-sensitive components and modules: 

CAUTION 

Always use grounding straps to avoid product damage when handling static-sensitive
components and modules.

1. Read all instructions and installation procedures included with static control materials and 
kits. 

2. Use static-protective containers to transfer modules and components (including bags and tote 
boxes). 

3. Wear a properly grounded ESD wrist strap when handling components, modules, or other 
static-sensitive devices. Figure 1-4 shows the ESD wrist strap in use. 

When using an ESD wrist strap: 

• Ensure the wrist strap fits snugly for proper conductivity. 

• Attach the alligator clip securely to a clean, unpainted, grounded metal surface such as the 
drive chassis or cabinet frame. 

• Do not overextend the grounding cord. 

Figure 1-4 ESD Wrist Strap

Installation



2.1 Introduction 

The SA600 and SA650 cabinets are the most commonly used cabinets for RA90 and RA92 disk 
drives. Procedures for unpacking, deskidding, and cabling 60-inch† cabinets are contained in this
chapter. This chapter also covers site preparation and planning considerations, drive acceptance 
testing procedures, and power-up diagnostics. 

Information on unpacking and installing add-on RA90 and RA92 disk drives in 60-inch cabinets can 
be found in product-specific documentation and is not covered here. 

2.2 Site Preparation and Planning 

Site preparation and planning are necessary before installing an RA90 or RA92 disk drive 
subsystem. Chapter 1 contains a full range of recommended environmental specifications. In
addition, consider the following items before attempting installation.

2.2.1 Power and Safety Precautions 

The RA90/RA92 disk drives do not present any unusual fire or safety hazards. It is recommended, 
however, that you check ac power wiring for the computer system to determine adequate capacity 
for expansion. 

2.2.2 Three-Phase Power Requirements 

The RA90 and RA92 disk drives use a single-phase power supply; however, the 881 power controller 
uses three phases. It is very important that the correct phase requirements for this product be met. 
Refer to Chapter 1 for power specifications. 

WARNING 

Hazardous voltages are present in this equipment. Installation and service must be 
performed by trained service personnel. Bodily injury or equipment damage may result 
from incorrect servicing. 

To prevent damage to equipment and personnel, ensure power sources meet the specifications
required for this equipment. 



† The SA600 and the SA650 are both 60-inch cabinets.

2.2.3 AC Power Wiring

The wiring used by Digital Equipment Corporation conforms to UL, CSA, and ISE standards. 
Figure 2-1 shows the ac plug configurations for RA90 and RA92 disk drives and 881 and 874 power
controllers.

2.2.4 Thermal Stabilization 

Thermal stabilization prevents temperature differences between the equipment and its environment 
from damaging disk drive components. 

Prior to installation, a 60-inch cabinet subsystem and the RA90/RA92 add-on drive must be stored 
at a temperature of 60°F (16°C) or higher for a minimum of 24 hours. These units may be stored 
either in the computer room or in another storage room under controlled temperature conditions. If 
stored in another storage room, each unit must sit for an additional hour in the computer room in 
which it is to be installed. 

CAUTION 

The thermal stabilization procedure is mandatory. Do not open the moisture barrier bag
until after the thermal stabilization period. Failure to thermally stabilize the equipment
may cause premature equipment failure.

After the thermal stabilization criteria have been met, carefully cut the moisture barrier bag and
proceed with the installation. 
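
The thermal stabilization criteria can be expressed as a simple check. This is a minimal sketch of the rule as stated in this section; the function and parameter names are hypothetical.

```python
# Illustrative check of the thermal stabilization rule; names are hypothetical.

MIN_STORAGE_TEMP_C = 16.0         # 60 degrees F (16 degrees C) or higher
MIN_STORAGE_HOURS = 24.0          # minimum storage time at that temperature
EXTRA_COMPUTER_ROOM_HOURS = 1.0   # extra time if stored in another room


def is_thermally_stabilized(storage_temp_c, storage_hours,
                            stored_in_computer_room, computer_room_hours=0.0):
    """Return True when the moisture barrier bag may be opened."""
    if storage_temp_c < MIN_STORAGE_TEMP_C:
        return False
    if storage_hours < MIN_STORAGE_HOURS:
        return False
    if stored_in_computer_room:
        return True
    # Stored elsewhere: the unit must also sit in the computer room.
    return computer_room_hours >= EXTRA_COMPUTER_ROOM_HOURS


if __name__ == "__main__":
    print(is_thermally_stabilized(18.0, 24.0, True))
    print(is_thermally_stabilized(18.0, 30.0, False, computer_room_hours=0.5))
```

A unit stored for 30 hours in a separate storage room but only half an hour in the computer room does not yet satisfy the rule.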

2.2.5 Floor Loading 

Consider the placement of this equipment, especially if a fully loaded 60-inch configuration is used. 
A fully loaded 60-inch cabinet weighs approximately 390 kilograms (860 pounds). Each RA90 or 
RA92 disk drive weighs approximately 31.8 kilograms (70 pounds). 

2.2.6 Operating Temperature and Humidity 

The required relative humidity range is between 10 percent and 90 percent with a minimum wet 
bulb temperature of 28°C (82°F) and a minimum dew point of 2°C (36°F) (non-condensing) with a 
step change of 10 percent or less. 

The RA90 and RA92 disk drives can be operated within temperatures of 10°C to 40°C (50°F to 
104°F). However, it is highly recommended that RA90 and RA92 disk drives be operated in a 
temperature range below 25°C (77°F) to increase reliability and extend product life. 
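
The temperature and humidity limits above lend themselves to a quick site check. The sketch below is illustrative only: it covers the required ranges and the 25°C recommendation stated in this section, ignores wet bulb, dew point, and step-change conditions, and uses hypothetical names.

```python
# Illustrative site check against the limits in this section; names are
# hypothetical, and wet bulb, dew point, and step change are not modeled.

REQUIRED_TEMP_C = (10.0, 40.0)   # required operating temperature range
REQUIRED_RH_PERCENT = (10.0, 90.0)
RECOMMENDED_MAX_TEMP_C = 25.0    # operation below this is highly recommended


def check_operating_environment(temp_c, rh_percent):
    """Return a list of findings; an empty list means no finding."""
    findings = []
    if not REQUIRED_TEMP_C[0] <= temp_c <= REQUIRED_TEMP_C[1]:
        findings.append("temperature outside required range")
    elif temp_c >= RECOMMENDED_MAX_TEMP_C:
        findings.append("temperature at or above recommended maximum")
    if not REQUIRED_RH_PERCENT[0] <= rh_percent <= REQUIRED_RH_PERCENT[1]:
        findings.append("relative humidity outside required range")
    return findings


if __name__ == "__main__":
    print(check_operating_environment(22.0, 50.0))
    print(check_operating_environment(30.0, 95.0))
```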

2.3 Unpacking the Cabinet 

The 60-inch cabinet configuration is packed in a cardboard carton attached to a wooden shipping 
pallet. Refer to Figure 2-2 and use the following procedure to unpack the cabinet: 

1. Inspect the shipping carton for any sign of external damage. Report any damage to the local 
carrier and to the Digital Customer Services or sales office. 

2. Remove the two cardboard U-sections but leave the sealed moisture barrier with desiccant in 
place during thermal stabilization. 

CAUTION 

This equipment must be thermally stabilized in the site environment for at least 24
hours before operation.

Figure 2-2 Unpacking the 60-Inch Cabinet

2.3.1 Deskidding the Cabinet

Three people are required to deskid the 60-inch cabinet. See Figure 2-3. 

WARNING 

Serious injury could result if the cabinet is improperly handled. 





Figure 2-3 Cabinet Deskidding 

1. Remove the two unloading ramps from their carton located under the carton top cover. 

2. Inspect the ramps, ramp side rails, and metal hardware for defects described in the following 
list: 

• Cracks more than 25 percent of the ramp depth, either across or lengthwise on the ramp. 

• Knots or knotholes going through the thickness of the ramp and greater than 50 percent of 
the ramp width. 

• Loose, missing, or broken ramp side rails. 

• Loose, missing, or bent metal hardware. 

If any of these conditions exist, do not use that ramp. Investigate alternate means of removing 
the cabinet and/or order a new ramp. The part number for the left ramp is 99-07689-01; the 
part number for the right ramp is 99-07689-02. 

3. Remove shipping bolts from the shipping brackets on each of the four levelers. See inset in
Figure 2-2. 

4. Remove shipping brackets from the four cabinet levelers. 

5. Fasten unloading ramps onto the pallet by fitting the grooved end of each ramp over the metal 
mating strip on the pallet. See Figure 2-4. 

Figure 2-4 Ramp Installation of Shipping Pallet

6. Screw the cabinet levelers (Figure 2-5) all the way up until the cabinet rests on its rollers on 
the pallet. 

7. Carefully roll the cabinet down the ramps (three people are required). 

8. Move the cabinet into its final position. 

9. Turn each leveler hex nut clockwise until the leveler foot contacts the floor (no weight on the 
casters) and the cabinet is level. 

2.4 Installing SDI Cables and Power Cords

Generally, SDI cables and power cords are installed in the 60-inch cabinet prior to shipping. Use 
this section as a reference should you need to remove or reinstall the power cords or SDI cables. 

Figure 2-5 Leveler Adjustment

2.4.1 Removing the Front and Rear Access Panels 

Use the following procedure to remove front and rear cabinet access panels. 

2.4.1.1 Front Access Panel Removal 

Refer to Figure 2-6 while performing this procedure: 

1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top 
of the panel. Turn the fasteners counterclockwise. 

2. Grasp the panel by its edges, tilt it toward you, and lift it up about 2 inches. Remove the panel 
and store it in a safe place. 

To reinstall the front panel, lift it into place and lower it straight down until the tabs on the panel's 
lower edge engage the slots in the cabinet support bracket. Hold the panel flush with the cabinet 
and use a hex wrench to lock the fasteners. 

Figure 2-6 Front Panel Removal

2.4.1.2 Removing the Rear Access Panel

Refer to Figure 2-7 while performing this procedure: 

1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top 
of the panel. Turn the fasteners counterclockwise.

2. Tilt the panel toward you and lift it up to disengage the pins at the bottom. 

3. Lift the panel clear of the enclosure and store it in a safe place.

When replacing the rear panel, lift it into place and fit the pins into the holes at the top of the I/O 
bulkhead. Push the top of the panel into place and turn the quarter-turn fasteners clockwise. 







Figure 2-7 Rear Access Panel Removal 

2.4.2 SDI Cable Connections and Routing

Both external and internal cables are connected to the I/O bulkhead located at the base of the drive 
cabinet. See Figure 2-8. Refer to product-specific documentation for more information. 

Figure 2-8 SDI Cable Connections and Routing - SA600 Example

2.4.3 Power Cord Connections and Routing

Figure 2-9 shows drive power cord connections and the recommended power cord routing for an 
SA600 storage array cabinet. Refer to product-specific documentation for power cord connections 
and routing for other subsystems. 

Figure 2-9 Power Cord Connections and Routing - SA600 Example

2.5 Locating the RA90/RA92 Disk Drive Power Supply

To access the RA90 or RA92 disk drive power supply, remove the cabinet rear access panel
(Figure 2-7). Figure 2-10 shows the location of the RA90/RA92 disk drive power supply, circuit 
breaker, and the Power OK LED. 

Figure 2-10 RA90/RA92 Power Supply Controls and Indicators

2.5.1 Plugging in the Power Cord 

The drive power cords in a fully-configured cabinet are already plugged into the power controller. 
Only the ac power cord from the cabinet power controller needs to be plugged into an external 
power source. 

NOTE 

Do not apply power to the power controller until proper voltage has been selected.
(Refer to Section 2.7.1.)

2.6 International Operator Control Panel Labeling

Each drive unit or cabinet configuration is shipped with a set of international labels for the operator
control panel (OCP). The labels come in a packet or on a single sheet. Select and apply the set of
labels applicable to the country in which the equipment is being installed. 

2.7 RA90/RA92 Disk Drive Acceptance Testing Procedures 

The following sections cover RA90/RA92 disk drive acceptance testing procedures. Follow each 
procedure to completion before starting the next.

Refer to Figure 2-11 while performing acceptance testing on RA90 and RA92 disk drives. A more 
detailed description of the RA90/RA92 OCP and its functions can be found in Chapter 3. 

Figure 2-11 RA90/RA92 Operator Control Panel

2.7.1 Voltage Selection 

Before applying power to RA90 or RA92 disk drives, ensure the proper operating voltage has 
been selected for your area of operation. The voltage selector is a slide switch capable of selecting 
120 volts or 240 volts. (The frequency 60 Hz or 50 Hz is universal.) To select the proper voltage,
perform the following steps: 

1. Remove the cabinet rear access panel (refer to Section 2.4.1.2). 

2. Verify the ac circuit breaker on the power controller is off. 

3. Verify the circuit breaker on each disk drive is off (0).

4. Locate the voltage selector switch (Figure 2-12). 

5. Using a non-conductive pointed object, slide the voltage selector switch into the position 
applicable to your site. 

Figure 2-12 Location of Voltage Selector Switch

2.7.2 Applying Power to the Drive 

Use the following procedure to apply power to RA90/RA92 disk drives: 

1. Verify drive voltage selector switch has been properly set (see Section 2.7.1). 

2. Verify the ac circuit breaker on the power controller is off. Also verify the circuit breaker on 
each disk drive is off. See Figures 2-10 and 2-13 for circuit breaker locations. 

Figure 2-13 Location of Power Controller Controls - 881 Example

3. Verify the Local/Remote switch on the 881 power controller is in the Local position. 

4. Verify the drive power cord is plugged into the power controller. 

5. Verify the external power source is correct. 

6. Plug the ac power cord from the power controller into an external power receptacle. 

7. Switch the ac circuit breaker on the power controller to the on position. 

8. Switch the ac circuit breaker on the RA90 or RA92 disk drive to the on position. 

2.8 Power-Up Resident Diagnostics

A sequence of drive-resident diagnostics run at power-up. The sequence consists of hardcore tests 
with basic processor tests. Successful completion of the hardcore tests is indicated by the following 
OCP displays: 

1. Blank (1 second) 

2. WAIT (16 seconds)

3. [0000] (If previously programmed, the drive unit number is displayed; otherwise, zeros are 
displayed.) 

2.8.1 OCP Lamp Testing 

Before continuing with acceptance testing, perform an OCP lamp test to ensure the LED state 
indicators and alphanumeric display are working properly. Perform the following procedure before 
selecting any other OCP switches (refer to Figure 2-11): 

1. Select the Test switch. The Test LED indicator lights. 

2. Select the Fault switch. All lamps light momentarily. 

3. Deselect the Test switch. 

All lamps should momentarily light. If not, ensure the OCP is seated properly and power is applied 
to the drive. Repeat the test. 

Replace the OCP if any lamps fail (refer to Section 6.7). 

2.8.2 Test Selection from the OCP 

It is necessary to select and run resident diagnostics from the OCP to complete acceptance testing. 
Use the following procedure to select and run diagnostics from the OCP. Figure 2-14 is a flowchart 
of this procedure. 

1. Power up the drive (if not done previously). 

2. Select the Test switch (test defaults to zero; no other operator action is required). 

3. Select the Write Protect switch. 

4. Select the diagnostic to run by using Port A and Port B switches. See the test selection 
flowchart (Figure 2-14). 

5. Start the test by selecting the Write Protect switch. 

6. Stop the test by selecting either the Port A or Port B switch. 

7. Restart the test by selecting the Write Protect switch again. 

8. Select the Test switch to exit the test mode. 

2.8.3 RA90/RA92 Idle Loop Acceptance Testing 

After the hardcore diagnostics have successfully run, the drive automatically enters an idle loop 
diagnostic test sequence. Do not select any front panel switches. Allow the drive to remain in the 
idle loop test for 5 minutes. 

Figure 2-14 Test Selection Flowchart

If an error occurs during power-up or during idle loop testing, the drive attempts to display an error
code. Table 2-1 lists error codes and required operator actions. Error codes not found in Table 2-1 
indicate a problem requiring additional troubleshooting. Refer to Chapter 5 for troubleshooting 
strategy. 

Table 2-1 OCP Error Codes

Error     Description                 Action

0F        Drive write protected       Disable write protection with the OCP Write Protect switch or
                                      turn off software write protection.

22        Drive over-temperature      Spin down and remove power from the drive. Ensure the
          condition                   cabinet air vent grill is clean and room temperature is within
                                      recommended limits. Call Digital Customer Services if neither
                                      a dirty air vent grill nor room temperature caused the
                                      over-temperature condition.

2D        Power supply over-          Spin down and remove power from the drive. Ensure the
          temperature condition       cabinet air vent grill is clean and room temperature is within
                                      recommended limits. Call Digital Customer Services if neither
                                      a dirty air vent grill nor room temperature caused the
                                      over-temperature condition.

3A, 6F    Write protect errors        Disable write protection with the OCP Write Protect switch or
                                      turn off software write protection.
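
The error-code handling described by Table 2-1 amounts to a lookup with a fallback to troubleshooting. The sketch below is illustrative: the dictionary and function names are hypothetical, the action text is abbreviated, and it assumes the first code in the table reads 0F (zero-F).

```python
# Illustrative lookup of Table 2-1 OCP error codes; names are hypothetical
# and the first code is assumed to read "0F" (zero-F).

WRITE_PROTECT_ACTION = (
    "Disable write protection with the OCP Write Protect switch "
    "or turn off software write protection."
)
OVER_TEMP_ACTION = (
    "Spin down and remove power from the drive. Check the cabinet air vent "
    "grill and room temperature."
)

OCP_ERROR_CODES = {
    "0F": ("Drive write protected", WRITE_PROTECT_ACTION),
    "22": ("Drive over-temperature condition", OVER_TEMP_ACTION),
    "2D": ("Power supply over-temperature condition", OVER_TEMP_ACTION),
    "3A": ("Write protect errors", WRITE_PROTECT_ACTION),
    "6F": ("Write protect errors", WRITE_PROTECT_ACTION),
}


def lookup_ocp_error(code):
    """Return (description, action), or None if the code is not in Table 2-1."""
    return OCP_ERROR_CODES.get(code.strip().upper())


if __name__ == "__main__":
    for code in ("2d", "6F", "41"):
        entry = lookup_ocp_error(code)
        if entry is None:
            print(code, "-> not in Table 2-1; refer to Chapter 5")
        else:
            print(code, "->", entry[0])
```

A code that returns no entry corresponds to the case described in the text: a problem requiring additional troubleshooting.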



2.8.4 Testing Spun-Down Drive 

To invoke resident diagnostics while the drive is still spun down: 

1. Select Test switch (Test indicator lights). 

2. Select the Write Protect switch: [T 00] is displayed. 

3. Input [T 60] into display. This is a loop-on-test utility.

4. Start T60 by selecting the Write Protect switch a second time. The following occurs: 

[S.60] 
[LOT] 
[C.60] 
[T 00] (LSD flashing) 

5. Input [T 00] into display. 

6. Start T00 by selecting the Write Protect switch a second time.

The drive is now running a sequence of resident diagnostics. A number of displays are seen during 
the execution of the diagnostics. These displays are normal. Examples of these displays are shown 
in Figure 2-15. 

Figure 2-15 OCP Displays During Testing

Allow drive tests to run for 5 minutes before continuing acceptance testing. To halt testing, select 
the Test switch (Test LED extinguishes). 

2.8.5 Testing Spun-Up Drive

To spin up the RA90 or RA92 disk drive, select the Run switch. The Run indicator lights and an
[R.~] appears in the display. Allow the drive to come to the ready state as indicated by the front 
panel Ready indicator. 

If either port (A or B) is selected when the drive reaches the ready state, deselect the port
switches, then proceed as follows: 

1. Select the Test switch. Test indicator lights. 

2. Select the Write Protect switch. [T 00] is displayed. 

3. Input [T 60] into display. This is a loop-on-test utility. 

4. Start T60 by selecting the Write Protect switch a second time. [LOT ] is displayed in the OCP.

5. Select the Write Protect switch. 

6. Input [T 00] into the display. 

7. Start T00 by selecting the Write Protect switch a second time.

The above steps invoke a sequence of resident-diagnostic tests. The tests check drive functions in 
the following areas: 

• Processor

• Servo bus

• Positioner

• Head select

• Read/write circuitry

• Fault detection circuitry

Allow the tests to run for 30 minutes to complete acceptance testing, then select the Test switch to
exit the test mode. The Test LED extinguishes, an [R...] appears in the display, and the Ready and
Run indicators light. Additionally, if either port switch is selected, it will be displayed after the unit
address: [RAB].

If an error occurs during power-up or during the idle loop diagnostics, the drive attempts to display 
an error code. Table 2-1 lists error codes and required operator actions. 

If no problems are encountered, place the drive on line. 

NOTE 

In an HSC cluster environment, you can duplicate system usage by running ILEXER for a
few minutes; in a non-HSC environment, a successful operating system disk initialization 
and mount operation are sufficient for verifying subsystem operation. 

2.9 Placing the Drive On Line 

The following procedure assumes drive acceptance testing and cabling procedures have been 
completed. If not, refer to the appropriate sections of this manual for details. 

2.9.1 Programming the Drive Unit Address 

The unit address can be set once power has been applied to the drive. The unit address is 
programmable in the range of 0 to 4094.†

Enter the test mode to set the unit address. In the test mode, Port A and Port B switches have the 
added function of selecting both the unit address numbers and test numbers. 

After applying power, follow this procedure to set the drive unit address. Figure 2-16 is a flowchart 
of this procedure. 

1. Select the Test switch. The Test LED lights and zeros are displayed. (Something other than 
zeros may be displayed if the unit address has been previously programmed.) 

2. Select the Port A switch for the ones position. Position zero will blink. 

3. Select the Port B switch. Position zero will increment 1 through 9 for every time Port B is 
selected. 

4. Select the Port A switch for the tens position. Position one will blink. 

5. Select the Port B switch. Position one will increment 1 through 9 for every time Port B is 
selected. 

6. Select the Port A switch for the hundreds position. Position two will blink. 

7. Select the Port B switch. Position two will increment 1 through 9 for every time Port B is 
selected. 

8. Select the Port A switch for the thousands position. Position three will blink. 

9. Select the Port B switch. Position three will increment 1 through 4 for every time Port B is 
selected. 

10. Select the Test switch to exit the unit selection function. 



† The KDA50/UDA50/KDB50 support drive logical unit addresses only up to 255.

Figure 2-16 Unit Selection Flowchart

Before exiting, you will be prompted to verify that you want the unit number changed. The OCP
displays the following prompt:

CHG UNT # {? [N]}

1. If you do not want to change the unit address, select the Test switch a second time.

2. To change the unit address, proceed as follows: 

• Toggle the Port B switch. CHG UNT # {? [Y]} displays. 

• Select the Test switch. The old unit address will be overwritten, and the new unit address 
will be displayed in the OCP. 

NOTE 

The unit address number is written to EEPROM and is not lost if the drive loses power. 
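
The digit-by-digit entry described above can be sketched in Python. This is an illustrative sketch only, not drive firmware. The function names are hypothetical. The sketch assumes that every display position starts at zero and that each Port B selection adds one to the blinking position. The 4094 upper limit and the 255 limit for the KDA50/UDA50/KDB50 come from this section; the lower limit of 0 is assumed.

```python
# Illustrative sketch only; not drive firmware.
# Assumption: each display position starts at 0 and each Port B
# selection increments the blinking position by one.

MAX_UNIT_ADDRESS = 4094        # upper limit stated in this section
SMALL_CONTROLLER_LIMIT = 255   # KDA50/UDA50/KDB50 limit from the footnote


def port_b_presses(unit_address):
    """Return the Port B selections needed for the ones, tens,
    hundreds, and thousands positions (positions zero to three)."""
    if not 0 <= unit_address <= MAX_UNIT_ADDRESS:
        raise ValueError("unit address must be in the range 0 to 4094")
    presses = []
    remaining = unit_address
    for _ in range(4):
        presses.append(remaining % 10)  # digit for this position
        remaining //= 10
    return presses


def fits_small_controller(unit_address):
    """True if the address can be used behind a KDA50/UDA50/KDB50."""
    return 0 <= unit_address <= SMALL_CONTROLLER_LIMIT


if __name__ == "__main__":
    print(port_b_presses(253))
    print(fits_small_controller(253))
```

If a unit address was programmed previously, the starting digits are not zero and the number of Port B selections differs from what this sketch returns.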

2.10 Installing RA90/RA92 Add-On Disk Drives in 60-Inch Cabinets

Information for unpacking and installing RA90/RA92 add-on disk drives into 60-inch cabinets can 
be found in product-specific documentation. Refer to the preface, About This Manual, for a list of 
related documentation.

Operating Instructions



3.1 Introduction

This chapter describes each of the RA90 and RA92 disk drive components. Module compatibility
tables are provided to explain the relationships between RA90 and RA92 disk drive hardware.
Drive block diagrams are included to illustrate component relationships.

This chapter also explains various operating modes of RA90/RA92 disk drives, and covers drive unit
address programming, test functions, and fault functions.

3.2 RA90/RA92 Disk Drive Components 

The main components of RA90 and RA92 disk drives are: 

• The electronic control module (ECM) 

• The preamp control module (PCM) 

• The blower motor assembly 

• The head disk assembly (HDA) 

• The drive power supply 

• The operator control panel (OCP) 

RA90/RA92 disk drives use three microprocessors to accomplish drive functions. The processors are 
the master (or I/O), the servo (or DSP), and the operator control panel (OCP) processor. 

Figure 3-1 shows a simplified block diagram of RA90/RA92 disk drives.

3.2.1 Electronic Control Module (ECM)

The ECM field replaceable unit (FRU) consists of two modules back-to-back mounted on a slide 
carrier. One module contains the input/output-read/write (I/O-R/W) circuitry and is referred to as 
the I/O-R/W module. The second module contains the servo circuitry and is referred to as the servo 
module. 

Each module has a set of four physical jumpers that are hard-wired at the factory. These jumpers 
are ECO-controlled and are used to mark the differences in functionality between the two hardware 
versions of the ECM modules. (These jumpers allow the microcode to display the correct hardware 
revision codes for the I/O-R/W and servo modules when running drive utility T45.) 

The two 70-class ECM module set versions and related 54-class component part numbers are listed 
in Table 3-1. 

Table 3-1 ECM Module Types - Compatibility Matrix

ECM P/N           I/O-R/W P/N   Servo P/N     Comments
70-22942-01 (1)   54-17771-01   54-17769-01   RA90 with HDA 70-22951-01 or 70-27268-01
70-22942-02 (1)   54-17771-02   54-17769-02   RA92-compatible

(1) The ECM FRU is available as a 70-class part. The individual 54-class parts are not field/customer available due to repair
and error log history strategies implemented by Digital.



The Digital circuit schematic (CS) revision alphanumeric marking on the ECM and its 54-class 
component modules does not reflect the microcode loaded into the non-volatile EEPROM as 
firmware code. This code is loaded in the field through the use of a microcode update cartridge. 
The microcode can then configure itself (enabled by the physical jumpers on the ECM modules) to 
assure the correct functionality of that particular ECM module. 

The functions of the I/O-R/W and servo modules are described in the sections that follow. 

3.2.1.1 I/O-R/W Module 

Functionally, the I/O-R/W module can be divided into three primary areas: SDI interface, control, 
and read/write. Figure 3-2 provides a block diagram of the I/O-R/W module. 

The control circuitry on this module contains the following: 

• MC6801 microprocessor (Master processor) 

• Memory (ROM and RAM) 

• Output control registers 

• Input status registers

The master microcode software controls drive functions through the control and status registers.
Functional and diagnostic software for the master is stored in ROM, RAM, EEPROM, and PROM
memories. The master is the logic processor, and it controls and performs the following tasks:

• OCP communications

• Drive fault detection (including error recovery)

• Servo processor communications

• Functional servo microcode loading

• Standard disk interconnect (SDI) processing

The master processor controls the servo processor through the use of software. The servo 
processor's response to master processor commands is also accomplished through the use of 
software. 

Upon power-up, the master processor (after self-testing the logic on the I/O-R/W) has the ability 
to test portions of the servo processor logic, including the servo processor RAM memory. After a 
successful test of the servo RAM, the master processor will execute a load of the functional servo 
microcode from the EEPROM located on the I/O-R/W module. 

RA90/RA92 disk drives are equipped with special error recovery circuits which the master processor 
controls. If the drive receives error recovery commands from the disk controller, the master 
processor software activates combinations of error recovery signals. As a result, drive read/write 
and servo characteristics are altered in an attempt to recover drive data. Appendix B contains a 
more detailed description of the RA90/RA92 error recovery mechanisms. 

The master processor retains the drive OCP switch state information and drive unit number in 
memory. This state information is saved into non-volatile EEPROM memory if a power loss is 
detected. Upon restoration of drive power, the original state of the drive can be resumed.

Functional microcode in the drive provides base level revision information concerning the I/O-R/W 
module. Drive utility T45 (refer to Chapter 4) displays a numeric (decimal) code that
translates to the module's hardware revision. The display format is [IOP=xx]. Table 3-2 presents
the displayed codes and the corresponding module part numbers and revisions. 



Table 3-2 I/O-R/W Module - Hardware Revision Matrix

T45 Displayed     I/O-R/W Module   C/S Part   Etch
Revision IOP=xx   Part Number      Revision   Revision   Compatibility
00                54-17771-01      Lx-Nx      E          RA90 only
01                54-17771-01      Rx-xx      F          RA90 only
03                54-17771-02      Ax-        F          RA92-compatible

3.2.1.2 Servo Module 

Figure 3-3 is a simplified block diagram of the servo portion of the ECM module. The servo portion 
of the ECM uses a digital signal processor of the Texas Instruments TMS family. The digital signal 
processor is called the servo processor (or sometimes the DSP processor). 

The servo processor communicates with the master processor and does the following: 

• Obtains embedded servo information from the I/O-R/W module for offset calibration of the 
read/write heads. 

• Obtains dedicated servo information for positioning the read/write heads.

• Controls spindle motor spin-up and spin-down operations.

• Monitors HDA spindle speed and servo positioning (including errors).

• Controls servo-related internal diagnostics.

Additionally, the servo processor controls the following:

• Retract (moving heads off data surface) 

• Return to zero (RTZ) 

• Fine track (keeping heads on track centerline) 



Functional microcode in the drive provides base level revision information concerning the servo 
module. Drive utility T45 (refer to Chapter 4) displays a numeric (decimal) code that
translates to the module's hardware revision. The display format is [SRV=xx]. Table 3-3 presents 
the displayed codes and the corresponding module part numbers and revisions. 



Table 3-3 Servo Module - Hardware Revision Matrix

T45 Displayed     Servo Module   C/S Part   Etch
Revision SRV=xx   Part Number    Revision   Revision   Compatibility
00                54-17769-01    Ax-Nx      E          RA90 only
01                54-17769-01    Px-xx      F          RA90 only
03                54-17769-02    Ax-        F          RA92-compatible
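
The T45 revision displays for the I/O-R/W and servo modules can be translated with a simple lookup. The following Python sketch is illustrative only. The function name `decode_t45` and the data layout are hypothetical; the codes, part numbers, and compatibility entries are transcribed from Tables 3-2 and 3-3.

```python
import re

# Illustrative sketch only. Codes and part numbers are transcribed from
# Tables 3-2 and 3-3; the names used here are hypothetical.
IOP_REVISIONS = {
    "00": ("54-17771-01", "RA90 only"),
    "01": ("54-17771-01", "RA90 only"),
    "03": ("54-17771-02", "RA92-compatible"),
}
SRV_REVISIONS = {
    "00": ("54-17769-01", "RA90 only"),
    "01": ("54-17769-01", "RA90 only"),
    "03": ("54-17769-02", "RA92-compatible"),
}
TABLES = {"IOP": IOP_REVISIONS, "SRV": SRV_REVISIONS}


def decode_t45(display):
    """Decode a display string such as '[IOP=03]'.

    Returns (module, part_number, compatibility), or None when the
    code is not listed in the tables."""
    match = re.fullmatch(r"\[(IOP|SRV)=(\d{2})\]", display.strip())
    if match is None:
        raise ValueError("not an [IOP=xx] or [SRV=xx] display")
    module, code = match.groups()
    entry = TABLES[module].get(code)
    if entry is None:
        return None
    return (module,) + entry


if __name__ == "__main__":
    print(decode_t45("[IOP=03]"))
    print(decode_t45("[SRV=00]"))
```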



3.2.2 Preamp Control Module (PCM) 

The PCM FRU is part of the HDA/carrier assembly which is also an FRU. Figure 3-4 is a simplified 
block diagram of the PCM. 

The PCM performs the following operations: 

• Decodes head select signals sent from the master to select the appropriate read/write head 
matrix chips (located inside the HDA), and the appropriate output from each matrix chip. 

• Monitors unsafe read/write conditions. 

• Provides differential write pulses to the preamplifiers. 

• Passes through the HDA vendor type bits from the HDA to the master processor. 

• Passes the type of format bits from the PCM switch pack to the master processor. 

Two different PCM modules exist in the RA90/RA92 disk drive family. The two PCM types are 
electrically incompatible in the interconnect between the PCM and the internal HDA electronics. 
However, the PCMs are functionally compatible between the PCM and internal HDA and the ECM 
variants that may be attached. A physical mechanism prevents the use of an incompatible PCM 
with an HDA. 

Table 3-4 describes the PCM switch pack settings with regard to the type of PCM, HDA, and RA9x
model.

Table 3-4 PCM Switch Pack Setup

                                PCM SW Pack Settings*
PCM P/N           HDA P/N       S1-1  S1-2  S1-3  S1-4  Comments
54-17758-01       70-22951-01                           RA90 long arm only
54-19724-01 (1)   70-27268-01         X           1     RA90 short arm
54-19724-01 (1)   70-27492-01   1                 1     RA92 only
54-19724-01 (1)   70-27492-01                     1     Incompatible setup (2)
54-19724-01 (1)   70-27492-01   1     1           1     Incompatible setup (2)
54-19724-01 (1)   70-27268-01                     1     Incompatible setup (2)
54-19724-01 (1)   70-27268-01   1     1           1     Incompatible setup (2)

*0 = ON = CLOSED, 1 = OFF = OPEN

(1) PCM spares shipped from logistics are configured by default to declare an incompatible situation. This forces the field
person to properly configure the replacement PCM to indicate the proper HDA format type. The drive microcode uses the
switch setting information to properly configure servo operations.

(2) Drive LED error code C0 signifies that the microcode has determined an incompatible situation between the hardware
and/or microcode components of the drive configuration, or a hardware failure has caused the drive to believe the
configuration is improper.



Functional microcode in the drive provides base level revision information concerning the PCM 
module. Drive utility T45 (refer to Chapter 4) displays a numeric (decimal) code that
translates to the module's hardware revision. The display format is [PCM=xx]. 



Table 3-5 presents the displayed codes and the corresponding module part numbers and revisions. 



Table 3-5 PCM Module - Hardware Revision Matrix

T45 Displayed         PCM Module        C/S Part   Etch
Revision PCM=xx (1)   Part Number       Revision   Revision   Compatibility
00                    54-17758-01 (2)   Ex-Hx      E          HDA 70-22951-01 only
01                    54-19724-01 (2)   Ax-        A          HDA 70-27268-01 and 70-27492-01

(1) Switch positions S1-3 and S1-4 of switch pack S1 determine the displayed PCM hardware revision.

(2) These modules have a mechanical interlock that prevents the inadvertent mating of electrically incompatible PCMs to the
HDA.

There is a four-position switch pack on the PCM. Switch pack switches S1-3 and S1-4 determine
the PCM hardware revision (not CS revision) shown by OCP display T45. Switches S1-1 and S1-2
are used to tell the drive functional microcode the format type written on the HDA. There are two
planned format types: RA90-compatible and RA92-compatible.

A new HDA/carrier assembly FRU should have the switch pack set correctly by the manufacturing
plant. If the PCM is defective and must be replaced, set the switch pack switches on the
replacement PCM appropriately. Figure 3-5 shows the location of the switch pack on the PCM.

PCM SWITCH: O = OPEN/OFF = LOGICAL 1
            C = CLOSED/ON = LOGICAL 0

Figure 3-5 PCM Switch Pack Location

3.2.3 Head Disk Assembly and Carrier Assembly 

Figure 3-6 is a simplified block diagram of the RA90/RA92 head disk assembly (HDA) and its 
relationship to the rest of the drive. 

The HDA consists of the following components: 

• The spindle motor, spindle, and recording media 

• The actuator motor to position the read/write heads 

• The Hall sensors to monitor spindle speed 

• The preamp/select chips 

• The brake assembly 

• The ground brush 

• The positioner lock mechanism 

Currently, there are three different HDAs in the RA9x disk drive products family. Two different 
PCMs are available for these three HDAs. Table 3-6 is a compatibility matrix for HDA types, PCM
types, and RA9x models.

Table 3-6 RA90/RA92 HDA Hardware Compatibility Matrix

HDA P/N                           RA90              RA90           RA92           PCM
and Type                          70-23899-01 (1)   70-23899-02    70-27490-01    P/N
70-22951-01, Long-arm RA90 HDA    Original*         Compatible     Incompatible   54-17758-01
70-27268-01, Short-arm RA90 HDA   Compatible        Original*      Incompatible   54-19724-01 (2)
70-27492-01, Short-arm RA92 HDA   Incompatible      Incompatible   Original*      54-19724-01 (2)

*Original = HDA type original to drive

(1) The RA90 disk drive was originally made from the base drive part number 70-23899-01. With the introduction of the
short-arm HDA, the variant of the base part number for the RA90 disk drive was changed. The size and SDI disk topology
of the 70-23899-01 and 70-23899-02 variant RA90 disk drives are identical. There is not a duplication of drive serial
numbers between the 70-class numbers. Architecturally, the drives are identical. At the HDA FRU level, the short-arm
HDA is electrically compatible with the original long-arm HDA. However, microcode compatibility issues must be watched.

(2) The PCM switch pack must be set to indicate the type of HDA.
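
The relationships in Table 3-6 can be expressed as a lookup when planning an HDA replacement. The Python sketch below is illustrative only. The names `HDA_TABLE`, `hda_fits_drive`, and `pcm_for_hda` are hypothetical; the part numbers and compatibility entries are transcribed from Table 3-6. The sketch does not cover microcode compatibility, which must be checked separately.

```python
# Illustrative sketch only. Data transcribed from Table 3-6; all names
# are hypothetical. Microcode compatibility is not checked here.
HDA_TABLE = {
    "70-22951-01": {
        "type": "Long-arm RA90 HDA",
        "pcm": "54-17758-01",
        "drives": {"70-23899-01": "Original",
                   "70-23899-02": "Compatible",
                   "70-27490-01": "Incompatible"},
    },
    "70-27268-01": {
        "type": "Short-arm RA90 HDA",
        "pcm": "54-19724-01",
        "drives": {"70-23899-01": "Compatible",
                   "70-23899-02": "Original",
                   "70-27490-01": "Incompatible"},
    },
    "70-27492-01": {
        "type": "Short-arm RA92 HDA",
        "pcm": "54-19724-01",
        "drives": {"70-23899-01": "Incompatible",
                   "70-23899-02": "Incompatible",
                   "70-27490-01": "Original"},
    },
}


def hda_fits_drive(hda_pn, drive_pn):
    """True if the table lists the HDA as Original or Compatible."""
    return HDA_TABLE[hda_pn]["drives"][drive_pn] != "Incompatible"


def pcm_for_hda(hda_pn):
    """Return the PCM part number listed for the HDA."""
    return HDA_TABLE[hda_pn]["pcm"]


if __name__ == "__main__":
    print(hda_fits_drive("70-27268-01", "70-23899-01"))
    print(pcm_for_hda("70-27492-01"))
```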



3.2.4 Dual Outlet Blower Motor 

The blower motor assembly provides drive cooling. In addition, the blower motor contains speed 
control circuitry to activate higher throughput if the ambient air temperature exceeds 23°C (75°F). 
If the drive is operating without problems at or below this temperature, blower speed is reduced for 
better acoustic levels. 

3.2.5 Power Supply 

The power supply provides the following voltages to RA90/RA92 disk drives: 

• ±12 Vdc 

• ±5.1 Vdc 

• ±24 Vdc 

• -5.2 Vdc 

Normal power supply operation is indicated by the presence of a green Power OK LED located at 
the rear of the drive. Refer to Figure 3-7 for the location of the Power OK LED. 

The power supply operates on any line frequency within the range of 47 Hz to 63 Hz. It is
switch-selectable to either of two ranges: 120 Vac or 240 Vac.

CAUTION

If a unit has its voltage selector switch in the 120 Vac position and is plugged into 240
Vac, the power supply will be damaged.

If a unit has its voltage selector switch in the 240 Vac position and is plugged into 120
Vac, it may work, but would be very sensitive to low line voltage.

Figure 3-7 Power Supply OK LED

This power supply has two vendors, designated Vendor A and Vendor B. Power supplies from 
Vendor A have a serial number with a CX site code. Power supplies from Vendor B have a serial 
number with a KB site code. (Voltage markings on some power supplies may read 115/230 Vac.) 
The power supplies from both vendors are functionally identical and carry the same Digital part 
number. 

3.2.6 Drive Functional Microcode 

The drive functional microcode can be field loaded and upgraded using the OCP microcode update 
port. ROM-based utility programs contained on the ECM module (I/O-R/W) allow microcode 
loading. 

Table 3-7 is a compatibility matrix for microcode cartridges, microcode levels, and ECM and HDA
FRUs.

Table 3-7 RA90/RA92 Microcode Compatibility With Drive FRUs

Microcode        Micro-   ECM FRU       ECM FRU       HDA FRU       HDA FRU       HDA FRU
Cart. P/N        code     P/N           P/N           P/N           P/N           P/N
and Rev          Level    70-22942-01   70-22942-02   70-22951-01   70-27268-01   70-27492-01
70-24432-02 A1   8        Yes           No (1)        Yes           No            No
70-24432-02 A1   9        Yes           No (1)        Yes           No            No
70-24432-02 B1   10       Yes           No (1)        Yes           No            No
70-24432-02 C1   11       Yes           No (2)        Yes           No            No
70-24432-02 D1   13       Yes           No (2)        Yes           No            No
70-27950-01 A1   20       Yes           Yes           No            Yes           Yes
70-27950-01 B1   25       Yes           Yes           Yes           Yes           Yes

(1) Results in LED Code 13

(2) Results in LED Code E2


NOTE 

Microcode compatible with an ECM FRU means the code can be loaded into the ECM 
FRU without error and will function, provided there is a compatible HDA with the 
appropriate PCM and PCM switch settings are correct. (This does not apply to Hard 
faults, because the microcode cannot be loaded into the ECM.) 

Microcode compatible with an HDA FRU means that the code (when loaded into a
compatible ECM) will support the HDA identified in Table 3-7.

To determine total compatibility, you must verify the following:

- Code compatibility to ECM (Table 3-7) 

- Code compatibility to HDA (Table 3-7) 

- ECM compatibility to HDA (Table 3-1) 

- PCM and HDA compatibility (Table 3-4) 

- PCM switch pack setup (Table 3-4) 
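
The first two checks in the list above (code to ECM and code to HDA) can be sketched as a lookup against Table 3-7. The Python below is illustrative only. The names are hypothetical and the Yes/No entries are transcribed from Table 3-7. The ECM-to-HDA, PCM-to-HDA, and switch pack checks are not modeled and must still be made against Tables 3-1 and 3-4.

```python
# Illustrative sketch only. Yes/No entries transcribed from Table 3-7;
# all names are hypothetical. Only the code-to-ECM and code-to-HDA
# checks are modeled.
ECM_01, ECM_02 = "70-22942-01", "70-22942-02"
HDA_LONG_RA90 = "70-22951-01"
HDA_SHORT_RA90 = "70-27268-01"
HDA_RA92 = "70-27492-01"

# microcode level -> (ECM FRUs marked Yes, HDA FRUs marked Yes)
MICROCODE_SUPPORT = {
    8: ({ECM_01}, {HDA_LONG_RA90}),
    9: ({ECM_01}, {HDA_LONG_RA90}),
    10: ({ECM_01}, {HDA_LONG_RA90}),
    11: ({ECM_01}, {HDA_LONG_RA90}),
    13: ({ECM_01}, {HDA_LONG_RA90}),
    20: ({ECM_01, ECM_02}, {HDA_SHORT_RA90, HDA_RA92}),
    25: ({ECM_01, ECM_02}, {HDA_LONG_RA90, HDA_SHORT_RA90, HDA_RA92}),
}


def microcode_ok(level, ecm_pn, hda_pn):
    """True if the level is marked Yes for both the ECM and the HDA."""
    if level not in MICROCODE_SUPPORT:
        raise KeyError("microcode level not listed in Table 3-7")
    ecms, hdas = MICROCODE_SUPPORT[level]
    return ecm_pn in ecms and hda_pn in hdas


if __name__ == "__main__":
    print(microcode_ok(20, ECM_02, HDA_RA92))
    print(microcode_ok(13, ECM_02, HDA_LONG_RA90))
```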

3.2.7 OCP Functions 

The operator control panel (OCP) shown in Figure 3-8 functions as the interface to the RA90/RA92 
disk drive. The OCP performs the following functions: 

• Selects and displays the unit address.

• Selects Run, Write Protect, Port A, and Port B.

• Displays fault indication and error codes.

• Selects tests in the test mode.

• Controls the drive software update process.

• Communicates with the RA90/RA92 master processor.

• Monitors momentary contact switches for closure.

Figure 3-8 RA90/RA92 OCP

To execute or select these functions, you must be familiar with the following OCP features (refer to
Figure 3-8):

• Six input switches (Run, Fault, Write Protect, Port A, Port B, and Test). 

• Seven LED indicators (Ready, Run, Fault, Write Protect, Port A, Port B, and Test). 

• A four-character alphanumeric display. 

• A software update port (refer to Chapter 7). 

3.3 RA90/RA92 Operating Modes 

RA90/RA92 disk drives operate in three setup modes: normal, fault display, and test. The following 
sections describe the function of each of these modes. 

3.3.1 Normal Mode Setup 

The normal mode setup is the usual operating mode of the RA90 and RA92 disk drives. Switch 
selection during normal operation usually consists of the Run switch, Write Protect switch (for 
normal write protection), and Port A or Port B switch. No Fault or Test indicators are lit. The 
switch states are displayed in the alphanumeric display, and the state of the drive relative to the 
controller is displayed in the LED indicators.

In the normal operating mode:

1. Selecting the Run switch causes an R to appear in the OCP display and causes the drive to spin 
up. Additionally, the Run LED indicator lights. The Ready LED indicator lights once the drive 
is up to speed. 

2. Selecting the Port A or Port B switch causes an A or B to appear in the OCP display and 
logically makes the drive available to the controller. 

3. Selecting the Write Protect switch logically write protects the drive and lights the Write Protect 
LED indicator. 

4. Selecting the Fault switch: 

• (Without a fault indicator) causes a 2-second OCP lamp test. 

• (With a fault indicator) causes an error code to display. Selecting the Fault switch a second 
time (with a fault indicator) clears the fault. (Refer to Section 3.3.2.) 

5. Selecting the Test switch: 

• (With the Port A or Port B switch selected) causes a 2-second display of the unit address. 
(Refer to Section 3.4.1 for information on the alternate unit address display mode.) 

• (Without the Port A or Port B switch selected) causes the drive to enter the test mode. (At 
this time the Ready LED is extinguished.) 

Table 3-8 details operator actions and the result of OCP switch selection(s) in the normal mode. 
Power-up OCP functions and normal switch selection functions are covered. 

Table 3-8 Power-Up: Normal Mode Operations

Operator Action   OCP Result   Drive Function
<Power-up>        [WAIT]       Drive is running power-up diagnostics
Default           [0000]       Unit number displayed may be something other than zero
<RUN>             [R...]       Spinup command issued to spindle
<A>               [R.A.]       Port A is enabled
<B>               [R.AB]       Port B is enabled

3.3.2 Fault Display Mode Setup

The fault display mode can only be entered if the Fault indicator is lit; otherwise, selecting the
Fault switch causes a 2-second OCP lamp test.

To enter the fault display mode, select the Fault switch. An error code is displayed in the format
shown in Figure 3-9. To exit the fault display mode and clear the fault, select the Fault switch a
second time.

NOTE

Hard faults will not clear.

Figure 3-9 shows a characteristic alphanumeric fault display error code. Figure 3-10 is a fault
display mode flowchart.

Figure 3-10 Fault Display Mode Flowchart

3.3.3 Test Mode Setup

You must enter the test mode to set the RA90 or RA92 disk drive unit address or to run resident 
diagnostic tests. In this mode, Port A and Port B switches have the function of selecting both the 
unit address numbers and test numbers. In addition, the port switches are used to abort running 
diagnostics. The Write Protect switch starts the tests and the Port A or Port B switch stops selected 
tests. 

The test mode is characterized by three displays. Figure 3-11 shows an OCP after test selection is 
made. Figure 3-12 shows a display while the test is running. 



DISPLAY = T 1*   (* INDICATES FLASHING DISPLAY)

Figure 3-11 OCP Display After Test Selection

Figure 3-12 OCP Display While Running Test

3.4 Programming the Drive Unit Address

The unit address can be set once power has been applied to the drive. You must set the drive unit 
address before placing the drive on line. 

The RA90 or RA92 unit address is programmable from 0 to 4094. (Note that the operating system
or subsystem type can limit the unit address range.) 

Use the following procedure to set the drive unit address. (Refer to Figure 3-13 for a flowchart of 
this procedure.) 

1. Select the Test switch. (The Test LED indicator lights and a unit address (if previously 
programmed) is displayed; otherwise, zeros are displayed.) 

2. Select the Port A switch for the ones position. (Position zero blinks.) 

3. Select the Port B switch. (Position zero increments 1 through 9 every time Port B is selected.)

4. Select the Port A switch for the tens position. (Position one blinks.) 

5. Select the Port B switch. (Position one increments 1 through 9 every time Port B is selected.) 

6. Select the Port A switch for the hundreds position. (Position two blinks.) 

7. Select the Port B switch. (Position two increments 1 through 9 every time Port B is selected.) 

8. Select the Port A switch for the thousands position. (Position three blinks.) 

9. Select the Port B switch. (Position three increments 1 through 4 every time Port B is selected.) 

10. Select the Test switch to exit. 

At this point, the OCP prompts you to verify that you want to change the unit address. The 
following prompt scrolls through the OCP display: 

CHG UNT # {? [N]}

• If you do not want to change the unit address, select the Test switch a second time. The drive
returns to normal mode.

• To change the unit address:

1. Toggle the Port B switch. CHG UNT # {? [Y]} displays.

2. Select the Test switch. 

The old unit address is overwritten with the new address. The new unit address is displayed in the 
OCP. 
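
The confirmation step can be modeled as a very small state machine. The Python sketch below is illustrative only and is not drive firmware. The class name `UnitAddressPrompt` and its methods are hypothetical. The sketch assumes that a second Port B selection toggles the answer back to N, which the procedure above does not state.

```python
# Illustrative sketch only; not drive firmware. The class and method
# names are hypothetical.
class UnitAddressPrompt:
    """Models the CHG UNT # confirmation prompt."""

    def __init__(self, old_address, new_address):
        self.old_address = old_address
        self.new_address = new_address
        self.answer = "N"            # the prompt starts at [N]
        self.stored = old_address    # value currently held by the drive

    def select_port_b(self):
        # Assumption: each Port B selection flips between N and Y.
        self.answer = "Y" if self.answer == "N" else "N"
        return "CHG UNT # {? [%s]}" % self.answer

    def select_test(self):
        # Test commits the new address only when the answer is Y.
        if self.answer == "Y":
            self.stored = self.new_address
        return self.stored


if __name__ == "__main__":
    prompt = UnitAddressPrompt(old_address=5, new_address=42)
    print(prompt.select_port_b())
    print(prompt.select_test())
```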

NOTE 

The new unit address is written to EEPROM and is not lost if the drive loses power.

Figure 3-13 Unit Address Selection Flowchart

3.4.1 Alternate Unit Address Display Mode

Future RA90 and RA92 disk drives will incorporate a microcode enhancement that will provide 
an alternate unit address display mode. To display the unit address, refer to Figure 3-14 while 
performing the following procedure: 

1. The OCP display shows an R, A, and/or B. 

2. While in normal mode, select the Port A and/or Port B switch. 

3. Select the Test switch. At this point, the Run, Fault, Write Protect, Port A, and Port B switches 
are disabled. 

4. The unit address is displayed until one of the following occurs:

• The Test switch is deselected.

• Power is cycled.

• An SDI HARD INIT occurs, or the drive forces a hard initialization due to a fatal error.

Any of these conditions will clear the OCP from the alternate display mode.

Figure 3-14 Alternate Unit Address Display Mode Flowchart

4

Drive-Resident Diagnostics and Utilities



4.1 Introduction

This chapter describes drive-resident diagnostic fault detection, power-up and idle loop diagnostic 
routines, and sequenced or chained diagnostics. The RA90/RA92 drive-resident diagnostics and 
utilities are described individually. These drive-resident diagnostics test for and detect errors in the 
following field replaceable units (FRUs): 

• Electronic Control Module (ECM) (input/output-read/write (I/O-R/W) and servo modules)

• Preamp Control Module (PCM)

• Head Disk Assembly (HDA)

4.2 Power-Up and Idle Loop Diagnostics 

Resident diagnostics execute any time the drive is powered up or the master processor is reset. 
Additionally, diagnostic routines execute during idle loop with the drive spun up or down. The Test 
LED, when lit, indicates the drive is in idle loop testing. 

The following sections describe power-up (reset) and idle loop diagnostic sequences. 

4.2.1 Power-Up (Hardcore) Diagnostics 

The following hardcore tests are run at power-up or upon reset of the master processor (refer to 
Section 4.7 for a description of each test): 

• Master CPU test (POR)

• Master ROM test (T01)

• Master RAM test (POR)

• Master timer test (T02)

• Serial communication test (SCI) (POR)

• Servo data bus loopback test (T03)

• Servo RAM test (POR)

4.2.2 Idle Loop Tests (Drive Spun Down)

Idle loop is defined as the drive being off line to the controller. The following sequence is executed 
every 30 seconds during idle loop (refer to Section 4.7 for a description of each test): 

• Master ROM test (T01)

• Master timer test (T02)

• Servo data bus loopback test (T03)

• Head select test (T06)

• Sector/byte counter test (T07)

• SDI loopback test (internal) (T08)

4.2.3 Idle Loop Tests (Drive Spun Up) 

The following tests are run during idle loop with the drive spun up (refer to Section 4.7 for a 
description of each test): 

• Master ROM test (T01)

• Master timer test (POR)

• Servo data bus loopback test (T03)

• Head select test (T06)

• Gray code (track counter) test (T29)

• Guardband test (T30)

• Incremental seek test (quick verify mode) (T31)

• Random seek test (quick verify mode) (T33)

4.3 Sequence Diagnostics 

A number of tests are sequenced together to form a chain of tests. The test [chain] numbers and 
the individual test numbers that make up the chain are listed here. An example of the information 
seen in the OCP alphanumeric display is also included. Refer to Section 4.7 for a description of 
each test. 

• T00 and T23 are the same when the drive is spun down, and include:

  T01
  T02
  T03
  T06
  T07
  T08
  Duration: 12 seconds

• T00 and T22 are the same when the drive is spun up, and include:

  T01
  T02
  T03
  T06
  T29
  T30
  T31
  T33
  T14
  T15
  T16
  Duration: 7:10 minutes

The following is an example of the information seen in the alphanumeric display as the drive 
sequences through a chain: 

1. [T 00] Enter test T00 from the OCP front panel.

2. [S 00] Start T00.

3. [S 01] T01 starts.

4. [C 01] T01 completes.

5. [S 02] T02 starts.

6. [C 02] T02 completes. 

(and so on until each diagnostic in the chain is completed) 

7. [T 00] Concludes with this display and the least significant digit (LSD) blinking. The OCP 
display is read from left to right with the LSD on the right side. 
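
The display pattern in the example above can be generated for any chain. The Python sketch below is illustrative only. The function name `chain_display_sequence` is hypothetical, and the sketch assumes that every test in a chain shows one start display and one complete display, as in the example.

```python
# Illustrative sketch only. The function name is hypothetical, and the
# pattern is assumed to follow the example above for every chain.

def chain_display_sequence(chain_number, tests):
    """Return the OCP displays expected while a chain runs."""

    def show(letter, number):
        return "[{} {:02d}]".format(letter, number)

    sequence = [show("T", chain_number), show("S", chain_number)]
    for test in tests:
        sequence.append(show("S", test))  # test starts
        sequence.append(show("C", test))  # test completes
    sequence.append(show("T", chain_number))  # chain concludes
    return sequence


if __name__ == "__main__":
    for display in chain_display_sequence(0, [1, 2, 3, 6, 7, 8]):
        print(display)
```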

The majority of tests are of a relatively short duration, with the following exceptions: 

• T31 (2.5 minutes; indefinite when standalone) 

• T32 (1 minute; indefinite when standalone) 

• T33 (55 seconds; indefinite when standalone)

Additional test chains are:

• T18: T01, 02, 03, 06 (8 seconds total) 

• T19: T14, 15, 16 (20 seconds total) 

• T20: (4 seconds if spun down; 2 seconds if spun up) 

• T21: T03, 29, 30, 31 (4.5 minutes), 32, 33 (7:10 minutes total); error if spun down 

• T22: Same as T00 except T31 (4.5 minutes) (7 minutes total)

• T23: T01, 02, 03, 06, 07, 08 (20 seconds total) 
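
The chain membership listed above lends itself to a lookup, for example to find which chain exercises a given test. The Python sketch below is illustrative only. The names are hypothetical. Only the chains whose member tests are spelled out in the list above are included; T20 and T22 are omitted because their members are not listed individually there.

```python
# Illustrative sketch only. Names are hypothetical. Only chains whose
# member tests are spelled out in the list above are included.
TEST_CHAINS = {
    18: [1, 2, 3, 6],
    19: [14, 15, 16],
    21: [3, 29, 30, 31, 32, 33],   # error if the drive is spun down
    23: [1, 2, 3, 6, 7, 8],
}


def chains_containing(test_number):
    """Return the chain numbers that include the given test."""
    return sorted(chain for chain, tests in TEST_CHAINS.items()
                  if test_number in tests)


if __name__ == "__main__":
    print(chains_containing(3))
    print(chains_containing(14))
```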

4.4 Standard OCP Displays Indicating Procedural Problems 

If you attempt to load and run a nonexistent test, [INVL] (invalid) displays in the OCP, followed 
by an error code. For example, if you attempt to run T10 (an invalid test number), the following 
occurs: 

1. [T 10] (Display) 

2. [S 10] (Display) 

3. [INVL] (2 seconds — indicates invalid test) 

4. [C 10]

5. [T 10]

No error code is generated. To continue, simply select another diagnostic.

If you attempt to run a diagnostic while the drive is faulted and that particular diagnostic cannot 
be run under fault conditions, the OCP displays [NRUN].
For example, read/write or seek tests cannot be run while the drive is faulted. However, ROM or
RAM tests can be run. 

If you attempt to run a test that requires the drive be spun up (but the drive is spun down), the 
following occurs: 

1. [T 14] Load T14. 

2. [S 14] Start T14. 

3. [T 14] (with fault light) 

4. Select Fault switch. 

5. [E CA] This error code indicates the drive must be spinning for the test to run successfully. 
Unless otherwise indicated, this is the format for all errors. 

Select the Fault switch again to clear the fault and continue. 

If you attempt to run a test that requires the drive be spun down (but the drive is spun up), the 
following occurs: 

1. [T 07] Load T07. 

2. [S 07] Start T07. 

3. [T 07] (with fault light) 

4. Select the Fault switch. 

5. [E 7B] Invalid-test-while-drive-is-spinning error. This is the format for all tests that are invalid 
while the drive is spinning. 

Select the Fault switch again to clear the fault and continue. 
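
The three procedural cases above can be summarized in a short Python sketch. It is an illustration only, not drive microcode: the function name and the two sets are assumptions, and the sets hold just the tests this section names as examples (T14 needing a spun-up drive, T07 needing a spun-down drive).

```python
# Illustrative model of the OCP responses described in this section.
# The sets are assumptions containing only the example tests from the text.
NEEDS_SPUN_UP = {14}    # T14 must run with the drive spinning
NEEDS_SPUN_DOWN = {7}   # T07 must run with the drive stopped


def ocp_sequence(test, valid_tests, spun_up):
    """Return the OCP displays seen when a test is loaded and started."""
    load = f"[T {test:02d}]"
    start = f"[S {test:02d}]"
    if test not in valid_tests:
        # Nonexistent test: [INVL] is shown, then the test "completes".
        return [load, start, "[INVL]", f"[C {test:02d}]", load]
    if test in NEEDS_SPUN_UP and not spun_up:
        # Fault light comes on; the Fault switch then shows the error code.
        return [load, start, load + " (fault light)", "[E CA]"]
    if test in NEEDS_SPUN_DOWN and spun_up:
        return [load, start, load + " (fault light)", "[E 7B]"]
    return [load, start, f"[C {test:02d}]"]


if __name__ == "__main__":
    print(ocp_sequence(10, valid_tests={7, 14}, spun_up=False))
    print(ocp_sequence(14, valid_tests={7, 14}, spun_up=False))
```

In each fault case, selecting the Fault switch a second time clears the fault, as described above.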

Some diagnostic test numbers call up other tests. These are displayed in the OCP after the 
diagnostic starts. An example of this is T24. The following is displayed in the OCP: 

1. [T 24] Load T24. 

2. Start test. 

3. [S 63] See test T63. 

4. After the head(s) are selected, select Write Protect. 

5. [T 31] Loaded by the drive. 

6. [S 31] See test T31. 

The reverse is not true. T63 does not start T24. 

4.5 Software Jumper 

References to a software jumper are frequently made throughout the discussion of diagnostics. 
To use the software jumper, simply select the Run/Stop switch within 1.5 seconds of starting a 
diagnostic requiring the jumper's use. 

CAUTION 

Do not use the jumper unless it is required. Valuable drive component information can 
be accidentally lost. Use the jumper only when instructed to do so. 

4.6 Temperature's Effect on Drive Performance 

The RA90/RA92 drive utilities T36, T38, and T39 measure various seek time parameters. Compare 
measured times to drive specifications in cases where seek time is in question. 

At the areal densities of the RA90/RA92 disk drives, variations of mechanical responses within 
the HDA mechanical structures change significantly over the wide temperature ranges acceptable 
to the drive. To control these variations and their impact on the subsystem, the drive monitors 
and compensates its seek profile to optimize the seek time performance. This compensation is a 
dynamic process that ensures top seek performance of the disk drive. 

4.7 Diagnostics Descriptions 

This section describes each of the diagnostic tests and utilities resident in the RA90/RA92 disk 
drive. Tests are listed by a test number (where applicable), a name, an explanation of how the test 
is invoked, and a test description. 

Conventions include the following: 

• (T00): test number 

• (POR): power-up or reset 

• (SDI): initialization performed by the controller over the SDI cable 

• ([0000]): items enclosed in square brackets represent the OCP alphanumeric display 

NOTE 

Some diagnostics implement a scrolling display pattern. To stop the scrolling display 
pattern, select the Run switch; this halts the display until you are ready to continue. 
Select the Run switch again to continue the display. 

Some tests run for several seconds then have results to display. These tests stop the 
scrolling display and send an asterisk to the display. Press the Run switch to display test 
results. 



Master CPU Test (POR) 

The Master CPU test verifies the basic functions of the drive master processor. Accumulator 
functions, conditional codes, and other MCU chip functions are tested. 



Master RAM Test (POR/SDI) 

The Master RAM test runs at power-up only. It verifies the master processor internal and static 
RAM. The test reads and writes, then reads each RAM location again to verify data integrity of the 
component. The test is executed in both forward and reverse directions. 



Serial Communications Interface (SCI) Test (POR) 

The SCI test checks the master processor serial communication interface by looping a data pattern 
from the serial output back to the serial input. It compares data out to data in for integrity. 
Additionally, the serial port is tested for overrun error detection and overrun recovery. The test 
simulates OCP MCU communication with the master MCU. 



Servo RAM Test (POR) 

The Servo RAM test checks the servo processor RAM by writing a pattern of ones and zeros through 
RAM. The entire 16 Kbytes of RAM is tested. 



(T01) Master ROM Test 

The Master Processor ROM test verifies the master processor internal ROM, EEPROM, and the 
associated address decode logic. A checksum is done on each ROM. Next, the test verifies that the 
consistency codes match between the MCU ROM and the master processor EPROM and EEPROM. 
If a failure occurs, the master processor attempts to display an error code to the OCP. 



(T02) Master Timer Test (POR/SDI) 

The Master Timer test verifies the output compare timer in the master processor by checking the 
Output Compare Flag (OCF) for stuck bits. Additionally, the test operates the timer in polling and 
interrupt modes. 

In polling mode, the output compare register generates a compare every 50 ms and ensures that 
the OCF sets within 60 ms. 

In interrupt mode, the output compare register generates a compare every 50 ms and checks for 
one interrupt within a 75 ms period. 

(T03) Servo Data Bus Loopback Test (POR/SDI) 

The Servo Data Bus Loopback test checks the data bus interface between the I/O-R/W module and 
the servo module (ECM) by rotating a single bit through each bit position on the servo data bus. 
The data pattern is written to the GASP register #1 and read back through GASP register #7. 
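
The rotating single-bit pattern is commonly called a walking-ones test. The Python sketch below shows the idea against a simulated bus. The 8-bit width, the `FakeBus` class, and the function names are assumptions made for illustration; they do not describe the GASP registers themselves.

```python
def walking_ones(width=8):
    """Yield patterns with a single 1 bit moved through each bit position."""
    for bit in range(width):
        yield 1 << bit


def loopback_failures(write_reg, read_reg, width=8):
    """Write each pattern, read it back, and return the patterns that differ."""
    failures = []
    for pattern in walking_ones(width):
        write_reg(pattern)
        if read_reg() != pattern:
            failures.append(pattern)
    return failures


class FakeBus:
    """Hypothetical stand-in for the write/read register pair."""

    def __init__(self, stuck_low=0):
        self.stuck_low = stuck_low  # mask of data lines stuck at 0
        self.latched = 0

    def write(self, value):
        self.latched = value & ~self.stuck_low

    def read(self):
        return self.latched


if __name__ == "__main__":
    good = FakeBus()
    bad = FakeBus(stuck_low=0b1000)
    print(loopback_failures(good.write, good.read))
    print(loopback_failures(bad.write, bad.read))
```

Because only one bit is set at a time, a failing pattern points directly at the data line that is stuck or shorted.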



(T04) Drive S/N Bus Test (POR/SDI) 



The Drive S/N Bus test checks the drive serial number bus (the rear flex cable between the 
and the servo module with the PCM switch pack). The three drive ID ports hardwired on the rear 
flex cable assembly are read and concatenated into one 20-bit binary encoded serial number. 

Bits 19 and 18 represent the manufacturing plant code of the drive (CX or KB). Bits 17 through 0 
are the alphanumeric serial number (00000-Z9999). The numbering scheme is displayed using the 
following: 



Encoded Serial        Decimal Drive 
Number Displayed      Serial Number 

00000-99999           0-99,999 
A0000-A9999           100,000-109,999 
B0000-B9999           110,000-119,999 
C0000-C9999           120,000-129,999 
D0000-D9999           130,000-139,999 
E0000-E9999           140,000-149,999 
F0000-F9999           150,000-159,999 
H0000-H9999           160,000-169,999 
J0000-J9999           170,000-179,999 
K0000-K9999           180,000-189,999 
L0000-L9999           190,000-199,999 
M0000-M9999           200,000-209,999 
N0000-N9999           210,000-219,999 
P0000-P9999           220,000-229,999 
R0000-R9999           230,000-239,999 
S0000-S9999           240,000-249,999 
T0000-T9999           250,000-259,999 
U0000-U9999           260,000-269,999* 
V0000-V9999           270,000-279,999 
W0000-W9999           280,000-289,999 
Y0000-Y9999           290,000-299,999 
Z0000-Z9999           300,000-309,999 



*NOTE 

U2143 is the maximum serial number that can be coded for the KB manufacturing site 
because only the bottom 18 binary bits are used for the serial number range. 

The test passes or fails based on the following valid and invalid bit-encoded binary information: 

VALID DRIVE S/N CODES 

BITS 
19 18 

0  0    CXO-built drive (serial number 1 through 262,143) 

0  1    CXO-built drive (serial number 262,144 through 309,999) 
        Limitation is based upon the number of alphabetic 
        characters available. 

1  0    KBO-built drive (serial number 1 through 262,143) 

INVALID DRIVE S/N CODES 

BITS    MIN BINARY VALUE      MAX BINARY VALUE 
19 18   BITS<17:00>           BITS<17:00> 

0  1    001011101011110000    111111111111111111 

1  1    000000000000000000    111111111111111111 



NOTE 

Do not alter these switches in the field unless you are instructed to do so during an 
ECO/FCO installation. 
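
The serial number encoding can be checked with a few lines of Python. This is a sketch under stated assumptions, not the drive's own algorithm: the function names are hypothetical, and the plant-code reading (bits 19 and 18 = 00 or 01 for CXO, 10 for KBO, 11 invalid) is taken from the tables above, where the minimum invalid value for the 01 code works out to 262,144 + 47,856 = 310,000.

```python
# Leading serial characters in table order (G, I, O, Q, and X are not used).
LEADING = "0123456789ABCDEFHJKLMNPRSTUVWYZ"


def displayed_to_decimal(serial):
    """Convert a displayed serial such as 'U2143' to its decimal value."""
    return LEADING.index(serial[0]) * 10_000 + int(serial[1:])


def decimal_to_displayed(number):
    """Convert a decimal serial number (0-309,999) to its displayed form."""
    lead, rest = divmod(number, 10_000)
    return f"{LEADING[lead]}{rest:04d}"


def decode_id(raw20):
    """Split a 20-bit ID into (site, decimal serial); raise if invalid."""
    plant = (raw20 >> 18) & 0b11   # bits 19 and 18
    low18 = raw20 & 0x3FFFF        # bits 17 through 0
    if plant == 0b11:
        raise ValueError("bits 19 and 18 both set: invalid code")
    # Assumed reading: code 01 extends the CXO range past 262,143.
    serial = low18 + (262_144 if plant == 0b01 else 0)
    if serial > 309_999:
        raise ValueError("serial number above Z9999")
    site = "KB" if plant == 0b10 else "CX"
    return site, serial


if __name__ == "__main__":
    print(displayed_to_decimal("U2143"))
    print(decimal_to_displayed(309_999))
    print(decode_id((0b10 << 18) | 262_143))
```

The value 262,143 is the largest number that fits in 18 bits, which is why U2143 is the ceiling for the KB site.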



(T06) Head Select Test 

The Head Select test checks the SDI gate array (SGA) head select register for stuck-at conditions. 
The test writes a head select pattern to an SGA internal register and verifies the pattern by reading 
it back through another SGA internal register. Each head select pattern is clocked to the preamp 
control module (PCM) verifying the correct head select chip can be enabled. 



(T07) Sector/Byte Counter Test 

The Sector/Byte Counter test checks the sector preset by writing and reading each bit in the sector 
preset register. The test checks the byte preset counter by presetting the byte counter. A full 
counting sequence is needed to increment the sector count by one. Finally, the sector/byte counter 
is checked with the actual preset values used in the functional code. A diagnostic clocking signal is 
used. 



(T08) SDI Loopback Test (Internal) 

The internal SDI Loopback test is executed with the TSID GATE ARRAY in loopback mode. 

The State Frame part of this test asserts the state bits (RDY, ATN, R/W, SEC) in the Real Time 
Drive State (RTDS) frame and checks the corresponding state bits (RDY, WRT, RD, INI) in the Real 
Time Controller State (RTCS) frame for accuracy. 

The Response Serializer part of this test sends framing codes (START, CONTINUE, END) by way 
of the CMDREG register, along with response data (a pattern) by way of the RSPDAT register. The 
test checks the correct framing codes by way of the INSTR2 register and the correct command data 
through the CMDATA register. This test is executed on Ports A and B. 



(T09) SDI Loopback Test (External) 

The external SDI Loopback test is the same as the internal SDI Loopback test except the SDI 
signals are looped back via connectors at the end of the SDI cables. See Figure 4-1. 



(T14) Read-Only Test 

The Read-Only test compares prerecorded data information from cylinder 2659 to the data read 
by each head. The data pattern is different for each head. If the compare fails, an error code is 
generated. In addition, if five off-track errors are detected while reading with any one head, an 
error is generated. Errors are analyzed in the following manner: 

• A sector is considered bad if the same sector fails to read the correct data three out of five 
times. 

• A head is considered bad if the same head contains nine bad sectors. 

If no errors are detected during this test, a compare error is induced to ensure that the IID chip 
compare circuitry can detect a compare error. 
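
The two thresholds used in the error analysis (three failures out of five reads for a sector, nine bad sectors for a head) can be expressed as follows. This Python sketch only illustrates the counting rule; the data layout and function names are assumptions.

```python
from collections import Counter


def bad_sectors(read_results, attempts=5, fail_limit=3):
    """Return the (head, sector) pairs that failed 3 of 5 reads.

    read_results maps (head, sector) to a list of booleans,
    one per read attempt, where True means the data compared correctly.
    """
    bad = set()
    for key, results in read_results.items():
        failures = sum(1 for ok in results[:attempts] if not ok)
        if failures >= fail_limit:
            bad.add(key)
    return bad


def bad_heads(bad, sector_limit=9):
    """Return the heads that own at least nine bad sectors."""
    per_head = Counter(head for head, _sector in bad)
    return {head for head, count in per_head.items() if count >= sector_limit}


if __name__ == "__main__":
    results = {(0, s): [False, False, True, False, True] for s in range(9)}
    results[(1, 0)] = [True, True, False, False, True]  # only two failures
    flagged = bad_sectors(results)
    print(sorted(flagged))
    print(bad_heads(flagged))
```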



(T15) Write/Read Test 

The Write/Read test executes only after the read-only (T14) test has passed. This test writes and 
reads dedicated cylinder 2660 using all read/write heads. 

Two patterns are used during this test: 

1. First, all the heads are written with an all-zeros-plus-a-SYNC-BIT pattern and read to verify 
that the data compares. If there are no errors, a NO SYNC detection test is run verifying that 
the IID sync detection circuitry is working correctly and that it can detect a NO SYNC error. 

2. Second, a ones-plus-a-SYNC-BIT pattern is written to cylinder 2660 and read back using each 
data head. Data is compared to ensure data integrity. 

I/O BULKHEAD 

CXO-2144A 

Figure 4-1 Using Loopback Connectors 



(T16) Read/Write Force Fault Test 

read/write faults. 



(T17) Read-Only Cylinder Formatter 

Read-only cylinder 2659 is written with a zeros-plus-a-SYNC-BIT pattern (all heads) and read 
back to verify data. Then another pattern is written and read back, and the data is compared for 
accuracy. This cylinder is not formatted by any other subsystem formatter. 

NOTE 

Use a software jumper to execute this utility. This protects the stored information from 
unintentional clearing. Refer to Section 4.5. 

Reformatting this cylinder is sometimes necessary in the field. 



(T18) Hardcore Sequence Test 

This sequence diagnostic consists of T01, 02, 03, and 06. Duration: 20 seconds. Drive may be spun 
up or down. 



(T19) Read/Write Sequence Test 

The drive must be spun up to run the Read/Write sequence test. This sequence diagnostic consists 
of T14, 15, and 16. Duration: 25 seconds. 



(T20) Servo Spinup Sequence Test 

See T03. 



(T21) Total Servo Sequence Test 

The drive must be spun up to run the Total Servo Sequence test. This sequence diagnostic consists 
of T03, 29, 30, 31, 32, and 33. Duration: 4.5 minutes. 



(T22) Total Drive Sequence Test (Spinning) 

The drive must be spun up to run this test. This sequence diagnostic consists of T01, 02, 03, 06, 29, 
30, 31, 33, 14, 15, and 16. Duration: 7 minutes. 

(T23) Total Drive Sequence Test (Spun down) 

The drive must be spun down to run this test. This sequence diagnostic consists of T01, 02, 03, 06, 
07, and 08. Duration: 20 seconds. 



(T24) Head Select and One Seek Test Sequence 

See T63. 

(T28) Drive-Sensed Temperature Display Utility 

This utility was implemented with version 25 of the drive microcode to display the drive-sensed 
temperature in degrees Fahrenheit, in a scrolling display on the OCP. Version 26 of the microcode 
displays this temperature in degrees Fahrenheit and Celsius. 

The OCP scrolling display is as follows: 

[ *TEMP=xxxF/xxC*] 

(T29) Gray Code (Track Counter) Test 

The Gray Code test checks that the correct gray code is generated from the two least significant 
bits of the track counter as the drive seeks from cylinder 0 to 3 and 3 to 0. This test is executed on 
the dedicated servo surface only. 
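
As a reminder of what a two-bit gray code looks like, the Python sketch below derives one from the two least significant bits of a cylinder number. It assumes the standard reflected binary code (00, 01, 11, 10); the manual does not state which encoding the servo surface uses, so treat the function names and the encoding as illustrative.

```python
def gray2(cylinder):
    """Return the 2-bit reflected gray code for the two LSBs of a cylinder."""
    low = cylinder & 0b11
    return low ^ (low >> 1)


def single_bit_steps(codes):
    """True if every pair of neighboring codes differs in exactly one bit."""
    return all(bin(a ^ b).count("1") == 1 for a, b in zip(codes, codes[1:]))


if __name__ == "__main__":
    forward = [gray2(c) for c in range(0, 4)]       # seek 0 to 3
    reverse = [gray2(c) for c in range(3, -1, -1)]  # seek 3 to 0
    print([f"{code:02b}" for code in forward])
    print(single_bit_steps(forward), single_bit_steps(reverse))
```

The property being exploited is that only one bit changes per track crossing, so a code that changes by two bits at once indicates a missed or corrupted crossing.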

(T30) Guardband Test 

The Guardband test checks the drive's ability to find inner and outer guardbands during seeks to 
these areas. 

(T31) Incremental Seek Test 

The Incremental Seek test exercises the servo by seeking between two cylinders using an 
incremental seek pattern. The starting cylinder, ending cylinder, and incremental value can be 
default or user defined. 

Default seek parameters are: starting cylinder 0, ending cylinder 2655 (last data cylinder), and an 
incremental value of 1. 

An example of the seek algorithm using the default seek parameters: 

BEG: 0-1-2-3-4-5- // 2653-2654-2655 :END 



(T32) Toggle Seek Test 

The Toggle Seek test does repetitive seeks between two cylinders. The starting and ending cylinders 
can be user defined or default cylinder addresses. 

Default seek parameters are: starting cylinder 0 and ending cylinder 2655 (last data cylinder). 

An example of the seek algorithm using the default seek parameters: 

BEG: 0-2655-0-2655-0-2655- etc. :END 



(T33) Random Seek Test 

The Random Seek test does repetitive seeks between two cylinders. The starting and ending 
cylinders can be user defined or default cylinder addresses. 

Default seek parameters are: starting cylinder 0 and ending cylinder 2655 (last data cylinder). 



(T34) Tapered Seek Test 

The Tapered Seek test exercises the servo by seeking between two cylinders using a tapered seek 
pattern. The pattern starts at the cylinder with the longest stroke and ends at the cylinder with 
the shortest stroke. 

The starting and ending cylinders can be user defined or default cylinders. 

Default seek parameters are: starting cylinder 0 and ending cylinder 2660 (diagnostic write 
cylinder). 

This example has the reference cyl=0 and ending cyl=2660: 

BEG: 0-2660-0-2659-0-2658-0-2657-0-2656- etc. 0-6-0-5-0-4-0-3-0-2-0-1-0 :END 

This example has the reference cyl=2660 and ending cyl=2660: 

BEG: 2660-0-2660-1-2660-2-2660-3-2660-4 etc. 2660-2658-2660-2659-2660 :END 

This example has the reference cyl=1330 and ending cyl=2660: 

BEG: 1330-2660-1330-2659-1330-2658 etc. 1330-1332-1330-1331-1330 :END 
BEG: 0-1330-1-1330-2-1330-3-1330-4 etc. 1327-1330-1328-1330-1329-1330 :END 

4.7.1 Seek Timing Tests 

The following diagnostics are classified as seek timing tests. Seek timing tests can be executed 
through the OCP or through the SDI level 2 DIAGNOSE command. 

At the completion of a timing test, position three is blank, positions two and one contain a timing 
test acronym (MH, MX, AV, HD), and position zero contains an asterisk (*). 

At this point, the results can be displayed. A scrolling message display reports the test results to 
the user. The message is scrolled, one character at a time, starting at the right side of the OCP 
and continuing off to the left side of the OCP. The Run switch is used to start and stop the scrolling 
display by pressing it once to start the display, and once to stop the display. 

All the timing tests use a 1-microsecond clock to calculate seek times. Because of this, the short 
seek and head switch times are not as accurate as the long seek times. 

(T36) Minimum Seek Timing Test 

This test executes the minimum seek timing algorithm and displays the results of the test in the 
OCP. Test time is approximately 75 seconds. 

The following scrolling message format is used to display test results: 

[MIN TIM FWD=xx.xMS] 
[MIN TIM REV=xx.xMS] 

where xx.x is the seek time (in milliseconds). The minimum seek time is defined as the average 
of 2655 single cylinder seeks (forward and reverse). This test uses the default incremental seek 
pattern. 

NOTE 

If the time exceeds 99.9, the decimal point is shifted one digit to the right. The OCP 
displays [999]. 

(T38) Average Seek Timing Test 

This test executes the average seek timing algorithm and displays the test results to the OCP. This 
test takes 5-7 minutes to complete. 

The following message is scrolled across the OCP display: 

[AVG TIM FWD=xx.xMS] 
[AVG TIM REV=xx.xMS] 

where xx.x is the seek time (in milliseconds). The average seek time is defined as the average of 
512 one-third-length seeks. For the RA90 disk drive, the seek length is 855 cylinders. For the RA92 
disk drive, the seek length is 1035 cylinders. 

Average seek time: < 21 milliseconds for RA90. 

Average seek time: < 19 milliseconds for RA92. 



DIGITAL INTERNAL USE ONLY 






(T39) Head Switch Timing Test 

This test executes the head switch timing algorithm and displays the test results to the OCP. 
This test takes approximately 2 seconds to run. The following message is scrolled across the OCP 
display: 

[HD SWT TIME=xx.xMS] 

where xx.x is the head switch time (in milliseconds). 

The head switch time is defined as the summation of all possible head switches divided by the total 
number of head switches. 



(T40) Update Cartridge Utility (Spun Down) 

The drive must be spun down to run the Update Cartridge utility. 

This internal microcode update utility is used in the field to update the following internal drive 
microcode functions: 

• Diagnostics microcode 

• Servo microcode 

• Functional microcode 

New microcode is loaded in the following sequence: 

1. Load update cartridge into update port. 

2. Load test T40. (Drive must be spun down.) 

3. Start test T40. The following occurs in the OCP display once this test has begun (S = start, 
P = pass, C = complete): 

[S 40] (2 seconds). 

[P 1] (20 seconds) Pass one checks PROM to be loaded. 

[P 2] (20 seconds) Pass two writes the new code into the even pages in EEPROM. 

[P 3] (20 seconds) Pass three writes the new code into the odd pages in EEPROM. 

[C 40] (1 second) Update is complete. 

[WAIT] (10 seconds) Exits test mode and goes through power-up hardcore sequence. 

[0000] Returns to display the drive unit address. 



(T41) Display Error Log Errors 

This utility displays the RA90/RA92 drive-resident error log. When initiated, it first verifies the 
integrity of the error log by reading the first four bytes of the error log header and comparing them 
to expected values. If the compare fails, the utility exits and an error code displays. 

The error log is displayed starting with the latest entry first and continuing until all entries are 
displayed. Positions three and two represent the error log entry in decimal. Positions one and zero 
represent the two-digit LED hex error code. Each entry is displayed for 1.5 seconds with the option 
of starting and stopping the display using the Run switch. 

NOTE 

Null entries are displayed as 00 and should be ignored. 
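
The four-position entry layout can be sketched in Python as shown below. The sketch assumes position three is the left-most character and that a null entry is one whose error code reads 00; neither detail is spelled out in this section, and the function names are hypothetical.

```python
def format_entry(entry_number, error_code):
    """Build the 4-character display: entry in decimal, code in hex."""
    return f"{entry_number:02d}{error_code:02X}"


def parse_entry(display):
    """Split a 4-character display into (entry number, error code)."""
    return int(display[:2]), int(display[2:], 16)


def meaningful(displays):
    """Drop entries whose error code is 00 (assumed to be null entries)."""
    return [parse_entry(d) for d in displays if parse_entry(d)[1] != 0]


if __name__ == "__main__":
    shown = [format_entry(3, 0xCA), format_entry(2, 0x7B), format_entry(1, 0x00)]
    print(shown)
    print(meaningful(shown))
```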

4.7.2 Time, Seeks, and Spinups Display Interpretation 

The time, seeks, and spinups display utilities all use the following format to display the counts to 
the OCP: 

POSITION    3    2    1    0 

OCP 1       X    X    9    8 
OCP 2 
OCP 3       7    6    5    4 
OCP 4 
OCP 5       3    2    1    0 
OCP 6 

CXO-2146B 

The following conventions are used: 

TM = time 
SK = seeks 
SP = spinups 

OCP 1 contains either TM, SK, or SP, and digits 9 and 8. 

OCP 3 contains digits 7, 6, 5, and 4. 

OCP 5 contains digits 3, 2, 1, and 0. 

OCP displays 2, 4, and 6 are used as separators to indicate the display is changing. 
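
The six-display sequence can be modeled in Python as follows. This is a sketch only: the function names are hypothetical, and the separator displays (2, 4, and 6) are represented as blank strings, which is an assumption about their appearance.

```python
SEPARATOR = "    "  # assumed blank display between data segments


def count_displays(kind, count):
    """Split a count into the six OCP displays for TM, SK, or SP."""
    if kind not in ("TM", "SK", "SP"):
        raise ValueError("kind must be TM, SK, or SP")
    digits = f"{count:010d}"  # digit 9 first, digit 0 last
    return [kind + digits[0:2], SEPARATOR,
            digits[2:6], SEPARATOR,
            digits[6:10], SEPARATOR]


def read_count(displays):
    """Rebuild (kind, count) from displays 1, 3, and 5."""
    first, third, fifth = displays[0], displays[2], displays[4]
    return first[:2], int(first[2:] + third + fifth)


if __name__ == "__main__":
    shown = count_displays("SK", 1234567)
    print(shown)
    print(read_count(shown))
```

When reading the seeks display, remember that the count is in thousands of seeks, and the time display is in minutes.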



(T42) Display Time Utility 

A 10-digit decimal number representing time is displayed when this utility is run. This number is 
time, in minutes, since the drive was first powered up. See Section 4.7.2 for display interpretation. 

(T43) Display Seeks Utility 

When this utility is run, the OCP displays the number of total seeks (times a thousand) since the 
drive was first powered up. A 10-digit decimal number is displayed in six segments at the OCP. 
Each segment is displayed for 1.5 seconds unless the display is halted by selecting the Run switch. 
See Section 4.7.2 for display interpretation. 



(T44) Display Spinups Utility 

This utility displays the total number of spinups since the drive was first powered up. When 
this utility is run, the total number of spinups is displayed on the OCP in six segments. Each 
segment is displayed for 1.5 seconds unless the display is halted by selecting the Run switch. See 
Section 4.7.2 for display interpretation. 



(T45) Drive Revision Level Utility 



This utility uses the following mnemonics to display drive component hardware and/or microcode 
revisions: 

DRV = Drive hardware revision 

DCD = Drive microcode revision (microcode) 

IOP = Master processor module (hardware) 

SRV = Servo module (hardware) 

PCM = Preamp control module (hardware) 

ORV = Operator control panel (hardware) 

OCD = Operator control panel (microcode) 

Running this utility displays the revision level for each module in a scrolling message format across 
the OCP. The following scrolling message format is used to display the information to the drive 
OCP: 

• DRV= www 

where www is the decimal hardware revision (0 to 255) of the drive. 

• DCD= yyy 

where yyy is the decimal revision number (0 to 255) of the combined functional, servo, and 
diagnostic microcode. The microcode is loaded from the microcode update cartridge. 

NOTE 

If a drive microcode revision (in the OCP display) contains an alpha character, for 
example, DCD=L200, this signifies unreleased code. The drive microcode should be 
updated with a formally released microcode revision. 

• IOP= xx 

where xx is the decimal revision number (0 to 15) of the appropriate module. 

• SRV= xx 

• PCM= xx 

• ORV= xx 

• OCD= z.z 

where z.z is the decimal revision number (0.0 to 9.9) for the OCP microcode. 

NOTE 

If the OCD is displayed as version 5.1 (OCD= 5.1), the drive has an OCP that allows 
the alternate unit address display mode to be used. Refer to Chapter 3. 



The hardware revision switches (Figure 4-2) provide the subsystem with the ability to determine 
base-level module revision compatibilities. The hardware switches are changed only by direction of 
a drive ECO/FCO. All ECO and FCO activity will take into account the impact of the changes to 
the drive and to the subsystem to which it is attached. 




RA90/RA92 DRIVE CHASSIS FRONT 

FOUR-POSITION HARDWARE REVISION DIP SWITCH 

CXO-2147B 

Figure 4-2 Hardware Revision Switches 

NOTE 

Do not alter these switches in the field unless you are instructed to do so during an 
ECO/FCO installation. 

The hardware revision switches make up only part of the total reported hardware revision. The 
total reported hardware revision is a byte of information determined as shown in Figure 4-3. 

BITS:  7  6  5  4  3  2  1  0 

HARDWARE REVISION SWITCHES 

INDICATES SERVO SYSTEM IMPLEMENTED IS: 
    00 = DEDICATED (ONLY SERVO SYSTEM) 
    01 = EMBEDDED (BLEND) SERVO SYSTEM 

HDA CONFIGURATION 
    00 = RA90 LONG-ARM HDA (P/N 70-22951-01) 
    01 = RA90 SHORT-ARM HDA (P/N 70-27268-01) 
    10 = RA92 HDA (P/N 70-27492-01) 
    11 = NOT USED 

CXO-2716B 

Figure 4-3 Hardware Revision Byte 



(T46) HDA Revision Utility 

This utility allows you to display the HDA revision/vendor bits in the OCP display. The first year of 
production will reflect HDA revision/vendor bit 0. 



OCP 1    V N = n 

CXO-2148B 



The two left-most places of the OCP display contain a VN for the vendor code. The right-most place 
of the OCP display contains a vendor code of 0 through 3. These revision/vendor bits are used to 
distinguish the HDA type to the drive microcode. These bits, in conjunction with PCM switches 
S1-1 and S1-2, tell the microcode how the servo system should be configured in microcode. 



(T47) Display Drive Serial Number Utility 

This utility displays the drive serial number to the OCP. 

The following message is scrolled (left to right) across the OCP display: 

[DRV S/N xxy_zzzz] 

where: 

- xx is the manufacturing location of the drive (CX=CXO, KB=KBO) 
- y is the alphanumeric digit 0-9 or A-Z (G, I, O, Q, and X are not allowed) 
- zzzz is 0000-9999 



(T50) Error Log Checkpoint Utility 

The Error Log Checkpoint utility allows you to enter a checkpoint entry into the drive 
error log. This is similar to a place marker. 



(T53) Clear Seeks Utility 

The Clear Seeks utility clears the total number of seeks since the drive was first powered up. Run 
this test any time the HDA is replaced. 

NOTE 

Use a software jumper to execute this utility. This protects the stored information from 
unintentional clearing. Refer to Section 4.5. 

This test causes [INVL] to be displayed if you fail to use the software jumper. 



(T54) Clear Spinups Utility 

The Clear Spinups utility clears the total number of spinups since the drive was first powered up. 
Run this test any time the HDA is replaced. 

NOTE 

Use a software jumper to execute this utility. This protects the stored information from 
unintentional clearing. Refer to Section 4.5. 

This test causes [INVL] to be displayed if you fail to use the software jumper. 



(T55) Clear DD Bit Utility 

The Clear DD Bit utility clears the DD bit set by the diagnostics or a controller. 



(T60) Loop-On-Test Utility 

This utility enables looping on a test. It can be set to loop on a diagnostic test or a diagnostic 
sequence of tests. [LOT] is displayed on the OCP for 1.5 seconds when the loop utility is run. The 
utility loops until an error is encountered or until the Test switch on the OCP is selected. 



OCP 1    [LOT] 

CXO-2149A 



(T61) Loop-On-Error Utility 

This utility loops continuously on errors encountered during the execution of drive internal 
diagnostics. The test loops as long as the error is present. [LOE] is displayed on the OCP for 
1.5 seconds when the loop utility is run. 



(T62) Loop-Off Utility 

The Loop-Off utility terminates all loop-on conditions. [LOF] is displayed on the OCP for 1.5 
seconds when this utility is run. 



OCP 1    [LOF] 

CXO-2150A 



The effects of the LOT or LOE utilities may be canceled manually (LOF) or by exiting OCP test 
mode and letting the idle loop routine execute at least one time. 



(T63) Head Select Utility 

The Head Select utility allows you to select or change the head to be tested. When the utility is 
first run, the currently selected head number is displayed in decimal (0-12) in the OCP display, and 
the least significant digit (LSD) blinks. 

The format is as follows: 



OCP 1    H a 

CXO-2151A 



The head number may be changed by selecting the Port B switch to increment the blinking digit. 

When the desired head number is displayed in the OCP, pressing the Write Protect switch causes 
that head to be selected and the head number to be changed in RAM. If the Test switch is pressed, 
the test is aborted and the change does not take place. 

The head remains selected until changed by this utility, power-up or reset, I/O processor reset, SDI 
INIT, or controller intervention. 



(T64) One Seek Utility 



The One Seek utility can be used to seek and lock on a cylinder. When run, the following OCP 
display is seen: 

OCP 1    C Y L        (CYLINDER 1.5 SEC) 
OCP 2    X X X X      (CYLINDER VALUE: 0-2660 SEC) 

CXO-2152A 



The right-most digit blinks to indicate cylinder value selection can begin. Selecting the Port A 
switch selects the next desired digit position which starts blinking upon selection. Digit position 
is from right to left (LSD to MSD). A wrap back to the LSD occurs if the Port A switch is selected 
enough times. Selecting the Port B switch increments the blinking digit. 

After the cylinder value is set, select the Write Protect switch to cause the heads to position 
themselves at the desired cylinder. Selecting the Test switch aborts the process without changing 
the cylinder value. 

The selected cylinder value is stored in RAM until T64 is run again or a power-up reset, master 
processor reset, or SDI INIT occurs. 
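
The switch-driven digit entry can be rehearsed with a small Python model before working at the OCP. The class below is a hypothetical illustration: it assumes a four-digit field and that incrementing a 9 wraps to 0, which the text does not state.

```python
class DigitEntry:
    """Model of cylinder entry using the Port A and Port B switches."""

    def __init__(self, value=0, width=4):
        self.digits = [int(ch) for ch in f"{value:0{width}d}"]  # MSD first
        self.position = width - 1  # blinking digit; starts at the LSD

    def port_a(self):
        """Select the next digit to the left, wrapping back to the LSD."""
        self.position = (self.position - 1) % len(self.digits)

    def port_b(self):
        """Increment the blinking digit (assumed to wrap from 9 to 0)."""
        self.digits[self.position] = (self.digits[self.position] + 1) % 10

    def value(self):
        return int("".join(str(d) for d in self.digits))


if __name__ == "__main__":
    entry = DigitEntry()
    for presses in (0, 6, 6, 2):  # LSD first: 0, then 6, 6, 2 gives 2660
        for _ in range(presses):
            entry.port_b()
        entry.port_a()
    print(entry.value())
```

After the fourth Port A press the blinking position is back at the LSD, matching the wrap described above.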



(T65) Seek Parameter Input Utility 

Four seek parameters can be examined or changed when using the seek timing tests T36, T38, and 
T39. They are: 

• FCY (first cylinder) 

• LCY (last cylinder) 

• INC (increment) 

• DLY (delay) 

Seek parameters are changed the same way as the seek utility parameters. Refer to tests T36, T38, 
and T39 for a discussion on altering parameters for diagnostics. 

The following describes the sequence of events which occur when test T65 is run: 

FCY= is the first display seen when this utility is started (Figure 4-4). The first cylinder value 
follows 1.5 seconds later. The FCY can be any decimal number between 0 and 2660. 

OCP 1    F C Y =      (FIRST CYLINDER VALUE 1.5 SEC) 
OCP 2    X X X X      (DESIRED VALUE: 0-2660 SEC) 

CXO-2153A 

Figure 4-4 T65 FCY OCP Display 

Next, select the Write Protect switch. LCY= is displayed (Figure 4-5). The last cylinder value 
follows 1.5 seconds later. The LCY can be any decimal value between 0 and 2660. 

OCP 1    L C Y =      (LAST CYLINDER VALUE 1.5 SEC) 
OCP 2    X X X X      (DESIRED VALUE: 0-2660 SEC) 

CXO-2154A 

Figure 4-5 T65 LCY OCP Display 

Select the Write Protect switch again. INC= is displayed (Figure 4-6). The incremental value 
follows 1.5 seconds later. The INC value can be any decimal number between 1 and 2660. If a 
value of 0 is chosen, the test loops indefinitely. 

OCP 1    I N C =      (CURRENT INCREMENT VALUE 1.5 SEC) 
         X X X X      (DESIRED VALUE: 0-2660 SEC) 

CXO-2155A 

Figure 4-6 T65 INC OCP Display 

Select the Write Protect switch and DLY= is displayed (Figure 4-7). The delay value between seeks 
is displayed 1.5 seconds later. A delay value can be between 0 and 2999 milliseconds. 



OCP 1    D L Y = 
OCP 2    X X X X    (CURRENT DELAY VALUE, 1.5 SEC; DESIRED VALUE: 0-2999) 

CXO-2156A 

Figure 4-7 T65 DLY OCP Display 

The seek parameters remain changed until this utility is run again or a power-up reset, I/O 
processor reset, or SDI INIT occurs. 

NOTE 

T65 does not check for out-of-range values. Do not exceed the maximum specified input 
values. Also, the last cylinder parameter must always be equal to or greater than the 
first cylinder parameter. If an invalid cylinder value is entered, a (servo) seek failed 
error (F5) occurs. 
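
Because T65 does not range-check its inputs, it can help to check a planned set of values before 
keying them in at the OCP. The following is a minimal sketch in Python, not drive firmware. The 
function name check_seek_params and the LIMITS table are illustrative assumptions; the limits 
themselves are the ones given in the T65 description above. 

```python
# Illustrative pre-check for T65 seek parameters (hypothetical helper, not firmware).
# Limits come from the T65 description: cylinders 0-2660, delay 0-2999 ms.
LIMITS = {"FCY": (0, 2660), "LCY": (0, 2660), "INC": (0, 2660), "DLY": (0, 2999)}


def check_seek_params(fcy, lcy, inc, dly):
    """Return a list of problems; an empty list means the values look usable."""
    problems = []
    values = {"FCY": fcy, "LCY": lcy, "INC": inc, "DLY": dly}
    for name, value in values.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    if lcy < fcy:
        problems.append("LCY must be equal to or greater than FCY")
    if inc == 0:
        problems.append("INC=0 makes the test loop indefinitely")
    return problems


if __name__ == "__main__":
    print(check_seek_params(0, 2660, 1, 100))   # no problems expected
    print(check_seek_params(500, 100, 0, 3000)) # three problems expected
```

An empty result only means the values are within the documented limits; the drive itself still 
reports a seek failed error (F5) if an invalid cylinder is entered. 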

(T66) Variable Average Seek Timing Test 

This test executes the average seek timing algorithm and allows you to time any length seek. To 
set the seek length, modify the first (FCY) and last (LCY) cylinder addresses through the seek 
parameter input utility (T65). 

The run time for this test varies, depending on the length of the seek used. The run time should 
not take longer than 45 seconds, regardless of the length of the seek. 

The following message is scrolled across the OCP display: 

[AVG TIM FWD=xx.xMS] 
[AVG TIM REV=xx.xMS] 

where xx.x is the seek time (in milliseconds). The variable average seek time is defined as the 
average (AVG) of 512 seeks in forward (FWD) and reverse (REV) directions. 
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5 

Troubleshooting and Error Codes 



5.1 Troubleshooting Reference Material 

When running diagnostics and interpreting error logs, you will need the documents listed 
(alphabetically) in Table 5-1. 

Table 5-1 Reference Material for Troubleshooting 

Document Title                        Order Number 

DSA Error Log Manual                  EK-DSAEL-MN 
DSA Error Log Pocket Service Guide    EK-DSAEL-PG 
Getting Started With VAXsimPLUS       AA-KN78A-TE 
HSC Service Manual                    EK-HSCMA-SV 
VAXsimPLUS Field Service Manual       AA-KN82A-RE 
VAXsimPLUS User Guide                 AA-KN8QA.-TE 

Refer to Section 5.19 for RA90/RA92 disk drive error codes and descriptions. 

5.1.1 Customer Support Training for the RA90/RA92 Disk Drive 

You must have the proper training to efficiently support the RA disk family. This training is 
available at most Customer Services Training Centers, category A and B sites. Consult with your 
Customer Services unit managers for training information. 

DSA Level I and HSC Level I courses are prerequisites to the RA90 IVIS training. 

Although support organizations are available to assist in problem solving, there is no substitute for 
proper training. Support training resources include DSA Level II and DSA Troubleshooting courses, 
and the RA90 Disk Drive Technical Description Manual. 

5.2 RA90/RA92 Troubleshooting Aids 

The following aids are available for disk drive troubleshooting: 

• VAXsimPLUS (VMS systems) (see Section 5.2.1) 

• Host error logs (see Section 5.2.2) 

• Drive internal error log (see Section 5.2.4) 

• Operator control panel (OCP) fault indicator/error codes (see Section 5.2.5) 

• Drive power supply indicator (see Section 5.2.6) 
• Drive error reporting mechanisms (see Section 5.2.7) 

• Host-level diagnostics/utilities (see Section 5.2.8) 

5.2.1 VAXsimPLUS 

The VAX System Integrity Monitor (VAXsimPLUS) provides access to VMS error log data. The 
three VAXsimPLUS manuals needed to operate VAXsimPLUS effectively are listed in Section 5.1. 

5.2.2 Host Error Logs 

Refer to the appropriate system error logs for error interpretation. The DSA Error Log Manual and 
the DSA Error Log Pocket Service Guide contain system error log descriptions for most operating 
systems. 

5.2.3 Extended Status Bytes 

Extended status bytes are part of the response to the SDI GET STATUS/TOPOLOGY command or 
any unsuccessful response to a level 2 command. These bytes are passed through the controller 
to the host for error logging purposes. Figure 5-1 shows a breakdown of the RA90/RA92 extended 
drive status bytes. Extended status bytes are described in detail in the sections that follow. 



BYTE    CONTENTS                       CLASS 

01      RESPONSE OPCODE 
02      UNIT NUMBER LOW BYTE 
03      SUBUNIT MASK 
04      REQUEST BYTE                   GENERIC DRIVE STATUS BYTE 
05      MODE BYTE                      GENERIC DRIVE STATUS BYTE 
06      ERROR BYTE                     GENERIC DRIVE STATUS BYTE 
07      CONTROLLER BYTE                GENERIC DRIVE STATUS BYTE 
08      RETRY COUNT 
09      PREVIOUS CMD OPCODE            EXTENDED DRIVE STATUS BYTE 
10      HDA REVISION BITS              EXTENDED DRIVE STATUS BYTE 
11      CYLINDER ADDR (LO)             EXTENDED DRIVE STATUS BYTE 
12      CYLINDER ADDR (HI)             EXTENDED DRIVE STATUS BYTE 
13      RECOVERY LVL | GROUP NO        EXTENDED DRIVE STATUS BYTE 
14      ERROR CODE                     EXTENDED DRIVE STATUS BYTE 
15      MFG FAULT CODE                 EXTENDED DRIVE STATUS BYTE 

CXO-2157B 



Figure 5-1 RA90/RA92 Extended Drive Status Bytes 
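
When the fifteen status bytes are copied from an error log, it can be convenient to label them 
mechanically. The sketch below is an illustration only: the field names and the function 
split_status_bytes are assumptions chosen to mirror Figure 5-1, and the only derived value is the 
cylinder address, formed from the low byte (byte 11) and high byte (byte 12). 

```python
# Hypothetical helper that labels the 15 status bytes in Figure 5-1 order.
# The manual numbers bytes from 1; Python indexes from 0.
FIELD_NAMES = [
    "response_opcode",       # byte 1
    "unit_number_low",       # byte 2
    "subunit_mask",          # byte 3
    "request",               # byte 4
    "mode",                  # byte 5
    "error",                 # byte 6
    "controller",            # byte 7
    "retry_count",           # byte 8
    "previous_opcode",       # byte 9
    "hda_revision",          # byte 10
    "cylinder_lo",           # byte 11
    "cylinder_hi",           # byte 12
    "recovery_level_group",  # byte 13 (left undivided here)
    "error_code",            # byte 14
    "mfg_fault_code",        # byte 15
]


def split_status_bytes(raw: bytes) -> dict:
    """Map 15 raw status bytes to named fields and add the cylinder address."""
    if len(raw) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} bytes, got {len(raw)}")
    fields = dict(zip(FIELD_NAMES, raw))
    # Bytes 11 and 12 hold the low and high halves of the cylinder address.
    fields["cylinder"] = (fields["cylinder_hi"] << 8) | fields["cylinder_lo"]
    return fields


if __name__ == "__main__":
    sample = bytes.fromhex("0a 5a 10 1b 00 80 00 00 0a 00 2c 05 04 f5 11")
    for name, value in split_status_bytes(sample).items():
        print(f"{name:22} {value:#06x}")
```

The sample bytes are made up for the demonstration and do not come from a real drive. 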
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5.2.3.1 Response Opcode (Byte 1) 

Response Opcode (Byte 1) is the drive-to-controller response opcode and indicates the success or 
failure of the previous controller-to-drive command. Generally, this is transparent to the user. 

5.2.3.2 Unit Number Low Byte (Byte 2) and Subunit Mask (Byte 3) 



BYTE 3               BYTE 2 

1 X X X X            X X X X X X X X 

DRIVE UNIT SELECT NUMBER (0 TO 4094 DECIMAL) 
SUBUNIT MASK (SUBUNIT REPORTING THIS STATUS) 
SUBUNIT 1 MASK (NOT USED) 
SUBUNIT 2 MASK (NOT USED) 
SUBUNIT 3 MASK (NOT USED) 

CXO-3017A 

5.2.3.3 Request Byte (Byte 4) 

BYTE 4    REQUEST BYTE 

(RU)  0 = RUN/STOP SWITCH OUT 
      1 = RUN/STOP SWITCH IN 

(PS)  0 = PORT SWITCH OUT 
      1 = PORT SWITCH IN 

(PB)  0 = PORT A RECEIVERS ENABLED 
      1 = PORT B RECEIVERS ENABLED 

(EL)  0 = NO LOGGABLE INFORMATION IN EXTENDED STATUS AREA 
      1 = LOGGABLE INFORMATION IN EXTENDED STATUS AREA 

(SR)  0 = SPINDLE NOT READY (NOT UP TO SPEED) 
      1 = SPINDLE READY 

(DR)  0 = NO DIAGNOSTIC IS BEING REQUESTED FROM THE HOST 
      1 = THERE IS A REQUEST FOR A DIAGNOSTIC TO BE 
          LOADED INTO THE DRIVE MICROPROCESSOR MEMORY 

(RR)  0 = DRIVE REQUIRES NO RECALIBRATE COMMAND 
      1 = DRIVE REQUESTS RECALIBRATE COMMAND 

(OA)  0 = DRIVE ON LINE OR AVAILABLE TO CURRENT CONTROLLER 
      1 = DRIVE UNAVAILABLE (IT IS ALREADY ON LINE TO ANOTHER 
          CONTROLLER) 

CXO-1281A 

5.2.3.4 Mode Byte (Byte 5) 



BYTE 5    MODE BYTE 

(S7)  0 = 512-BYTE SECTOR FORMAT (16-BIT) 
      1 = 576-BYTE SECTOR FORMAT (18-BIT) 
          (NO CURRENT PLAN TO IMPLEMENT 18-BIT) 

(DB)  0 = DBN AREA ACCESS DISABLED 
      1 = DBN AREA ACCESS ENABLED 

(FO)  0 = FORMATTING OPERATIONS DISABLED 
      1 = FORMATTING OPERATIONS ENABLED 

(DD)  0 = DRIVE ENABLED BY CONTROLLER ERROR ROUTINE 
          OR DIAGNOSTIC 
      1 = DRIVE DISABLED BY CONTROLLER ERROR ROUTINE 
          OR DIAGNOSTIC (FAULT LIGHT = ON) 

(W1)  0 = WRITE-PROTECT SWITCH FOR SUBUNIT IS OUT 
      1 = WRITE-PROTECT SWITCH FOR SUBUNIT IS IN 

(W2)  NOT IMPLEMENTED 

(ED1) ERROR LOG DISABLE (SET BY TWO-BOARD CONTROLLER DIAGNOSTICS) 

(ED0) ERROR LOG DISABLE (SET BY TWO-BOARD CONTROLLER DIAGNOSTICS) 

CXO-2193A 



Bits ED1 and ED0 can only be set by two-board controller diagnostics. If either ED1 or ED0 is set 
(EDx=1), the RA90/RA92 disk drive turns off internal error logging. 

5.2.3.5 Error Byte (Byte 6) 



BYTE 6    ERROR BYTE 

(WE)  0 = NO ERROR 
      1 = WRITE LOCK ERROR (ATTEMPT TO WRITE WHILE 
          WRITE-PROTECTED) 

      NOT USED 

(DF)  0 = NO ERROR 
      1 = DRIVE FAILURE DURING INIT 

(PE)  0 = NO ERROR 
      1 = LEVEL 2 PROTOCOL ERROR (IMPROPER COMMAND 
          CODES OR PARAMETERS ISSUED TO DRIVE) 

(RE)  0 = NO ERROR 
      1 = SDI RECEIVE ERROR ON SDI TRANSMISSION 
          LINE(S) FROM CONTROLLER 

(DE)  0 = NO ERROR 
      1 = DRIVE ERROR (DRIVE FAULT LIGHT MAY BE ON; 
          CAN BE CLEARED VIA DRIVE CLEAR COMMAND) 

CXO-1283C 
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The error byte is one of four generic status bytes. Error bits in the error byte are set by the drive 
for drive-detected errors. The controller clears the bits with the SDI DRIVE CLEAR command. The 
bits are described as follows: 

• The DE bit reports any internal drive error that requires explicit controller recovery action 
other than simple command retransmission or context readjustment. 

• The RE bit reports transmission errors detected by the drive. 

• The PE bit reports level 2 protocol errors detected by the drive. 

• The DF bit indicates the drive did not pass its initialization/diagnostics the last time it was 
initialized or powered up. 

• The WE bit reports the drive received a SELECT TRACK AND WRITE command or a FORMAT 
command while the drive was write protected. 

NOTE 

Drive-detected errors fit into one of the five classes described above and are reported as 
such. 

Controller-detected drive errors are logged without any of these bits being set. For 
example, the drive actuator has positioned itself to a cylinder other than the one the 
controller requested. The controller detects this failure as a drive positioner error or an 
invalid header error. 

5.2.3.6 Controller Byte (Byte 7) 



BYTE 7    0 0 0 0 X X X X    CONTROLLER BYTE 

0000 = NORMAL DRIVE OPERATION 
1000 = DRIVE IS OFF LINE AND UNDER CONTROL OF A DIAGNOSTIC 
1001 = DRIVE IS OFF LINE DUE TO ANOTHER DRIVE HAVING THE 
       SAME UNIT SELECT IDENTIFIER 

(S1) 1 = NOT USED 
(S2) 1 = NOT USED 
(S3) 1 = NOT USED 
(S4) 1 = NOT USED 

CXO-2158A 



5.2.3.7 Retry Count (Byte 8) 

Byte 8 is the retry count during the last SEEK or RECALIBRATION command. (The retry count is 
the number of times the command was retried, internal to the drive, in an attempt to successfully 
complete the SEEK or RECALIBRATE operation.) 
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5.2.3.8 Previous Command Opcode (Byte 9) 



BYTE 9    LAST OPCODE (EXTENDED DRIVE STATUS BYTE) 

OPCODE OF THE LAST PREVIOUS LEVEL 2 DRIVE COMMAND 
DECODED BY THE DRIVE (RECEIVED FROM THE SDI CONTROLLER) 

81 = CHANGE MODE 
82 = CHANGE CONTROLLER FLAGS 
03 = DIAGNOSE 
84 = DISCONNECT (DRIVE) 
05 = DRIVE CLEAR 
06 = ERROR RECOVERY 
87 = GET COMMON CHARACTERISTICS 
88 = GET SUBUNIT CHARACTERISTICS 
0A = INITIATE SEEK 
8B = ON LINE 
0C = RUN 
8D = READ MEMORY 
8E = RECALIBRATE 
90 = TOPOLOGY 
0F = WRITE MEMORY 
FF = SELECT GROUP (LEVEL 1 COMMAND - PROCESSED BY FIRMWARE 
     SEEK HEAD SELECT SUBROUTINES) 

CXO-1285B 
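
The opcode list above lends itself to a simple lookup when reading byte 9 from an error log. The 
following sketch is illustrative only; the names LEVEL2_OPCODES and describe_opcode are 
assumptions, and the table contains nothing beyond the values listed in the figure. 

```python
# Lookup table transcribed from the byte 9 opcode list (hypothetical helper names).
LEVEL2_OPCODES = {
    0x81: "CHANGE MODE",
    0x82: "CHANGE CONTROLLER FLAGS",
    0x03: "DIAGNOSE",
    0x84: "DISCONNECT (DRIVE)",
    0x05: "DRIVE CLEAR",
    0x06: "ERROR RECOVERY",
    0x87: "GET COMMON CHARACTERISTICS",
    0x88: "GET SUBUNIT CHARACTERISTICS",
    0x0A: "INITIATE SEEK",
    0x8B: "ON LINE",
    0x0C: "RUN",
    0x8D: "READ MEMORY",
    0x8E: "RECALIBRATE",
    0x90: "TOPOLOGY",
    0x0F: "WRITE MEMORY",
    0xFF: "SELECT GROUP (LEVEL 1 COMMAND)",
}


def describe_opcode(byte9: int) -> str:
    """Return the command name for a byte 9 value, or flag it as unlisted."""
    return LEVEL2_OPCODES.get(byte9, f"UNLISTED OPCODE {byte9:02X}")


if __name__ == "__main__":
    for value in (0x0A, 0x8E, 0x42):
        print(f"{value:02X} = {describe_opcode(value)}")
```

Every value in the list has an even number of bits set, which gives a quick check on a 
hand-copied opcode. This is an observation about the listed values, not a statement made by the 
manual. 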



5.2.3.9 HDA Revision Bits (Byte 10) 

Byte 10, bits 0 and 1, indicate which vendor heads are used in the HDA. Bit 7 is the 
UNCALIBRATED bit and indicates the drive failed during drive recalibration. 

5.2.3.10 Cylinder Address (Bytes 11 and 12) 

Decoding bytes 11 and 12 gives you the cylinder address from the last SDI SEEK command issued 
to the drive. See Examples 5-1 (for the RA90 disk drive) and 5-2 (for the RA92 disk drive) to 
determine cylinder address and group (head). 
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The RA90 implements the following geometry for logical 
addressing: 






The RA90 has 1 logical track = 1 physical track 

The RA90 has 1 logical group = 1 logical track 

The RA90 has 1 logical cylinder = 13 logical groups 

The current cylinder address and current group bytes indicate the 
cylinder address and group where the read/write heads are 
positioned. The following formulas outline how to obtain 
the cylinder and head from the logical block number (LBN). 

Cylinder (cyl) = LBN/897 = cyl.fraction (discard fraction) 

Head = (LBN - (cyl * 897))/69 = head.fraction (discard fraction) 

LBN to physical cylinder and head number 
conversion: 

If LBN = 23609 

Then 23609/897 = 26.32 (discard fraction) 

CYL = 26 

To find the head, use the following example: 

Head = (23609 - (26 * 897))/69 

Head = 4.16 (discard fraction) 

Head = 4 

As you can see, LBN 23609 = head 4 and physical cylinder 26. 

DBN to physical cylinder and track (head on 
RA90 disk drives) conversion: 

CYL = 2654 + DBN/910 = cylinder.fraction (discard fraction) 

Head = (DBN - ((CYL - 2654) * 910))/70 = head.fraction (discard 
fraction) 

XBN to physical cylinder and head conversion: 

CYL = 2651 + XBN/910 = cylinder.fraction (discard fraction) 

Head = (XBN - ((CYL - 2651) * 910))/70 = head.fraction (discard 
fraction) 

RBN to physical cylinder and head conversion: 

CYL = RBN/13 = cylinder.fraction (discard fraction) 

Head = RBN - (CYL * 13) 



Example 5-1 RA90 Cylinder Address and Group (Head) 
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The RA92 implements the following geometry for logical 
addressing: 

The RA92 has 1 logical track = 1 physical track 

The RA92 has 1 logical group = 1 logical track 

The RA92 has 1 logical cylinder = 13 logical groups 

The current cylinder address and current group bytes indicate the 
cylinder address and group where the read/write heads are 
positioned. The following formulas outline how to obtain the 
cylinder and head from the logical block number (LBN). 

Cylinder (cyl) = LBN/949 = cyl.fraction (discard fraction) 

Head = (LBN - (cyl * 949))/73 = head.fraction (discard fraction) 

LBN to physical cylinder and head number 
conversion: 

If LBN = 23609 

Then 23609/949 = 24.88 (discard fraction) 

CYL = 24 

To find the head, use the following example: 

Head = (23609 - (24 * 949))/73 

Head = 11.41 (discard fraction) 

Head = 11 

As you can see, LBN 23609 = head 11 and physical cylinder 24. 

DBN to physical cylinder and track (head on 
RA92 disk drives) conversion: 

CYL = 3104 + DBN/962 = cylinder.fraction (discard fraction) 

Head = (DBN - ((CYL - 3104) * 962))/74 = head.fraction (discard 
fraction) 

XBN to physical cylinder and head conversion: 

CYL = 3101 + XBN/962 = cylinder.fraction (discard fraction) 

Head = (XBN - ((CYL - 3101) * 962))/74 = head.fraction (discard 
fraction) 

RBN to physical cylinder and head conversion: 

CYL = RBN/13 = cylinder.fraction (discard fraction) 

Head = RBN - (CYL * 13) 



Example 5-2 RA92 Cylinder Address and Group (Head) 
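
The conversions in Examples 5-1 and 5-2 are integer divisions, so they can be checked quickly 
with a short program. The sketch below is an illustration under stated assumptions: the names 
Geometry, RA90, RA92, and the four functions are hypothetical, and the field names 
(blocks per track) are an interpretation of the divisors 69/70 and 73/74 used in the examples. 
The per-cylinder divisors 897, 910, 949, and 962 are those values multiplied by 13 heads. 

```python
from collections import namedtuple

HEADS = 13  # 1 logical cylinder = 13 logical groups

# lbn_per_track: divisor used for LBNs; ext_per_track: divisor used for DBNs and XBNs.
# xbn_base / dbn_base: first cylinder of the XBN and DBN areas in the examples.
Geometry = namedtuple("Geometry", "lbn_per_track ext_per_track xbn_base dbn_base")

RA90 = Geometry(lbn_per_track=69, ext_per_track=70, xbn_base=2651, dbn_base=2654)
RA92 = Geometry(lbn_per_track=73, ext_per_track=74, xbn_base=3101, dbn_base=3104)


def lbn_to_cyl_head(lbn, geo):
    """CYL = LBN / (per_track * 13); HEAD = remainder / per_track."""
    cyl, rest = divmod(lbn, geo.lbn_per_track * HEADS)
    return cyl, rest // geo.lbn_per_track


def dbn_to_cyl_head(dbn, geo):
    """Same arithmetic as LBN, offset by the first DBN cylinder."""
    offset, rest = divmod(dbn, geo.ext_per_track * HEADS)
    return geo.dbn_base + offset, rest // geo.ext_per_track


def xbn_to_cyl_head(xbn, geo):
    """Same arithmetic as LBN, offset by the first XBN cylinder."""
    offset, rest = divmod(xbn, geo.ext_per_track * HEADS)
    return geo.xbn_base + offset, rest // geo.ext_per_track


def rbn_to_cyl_head(rbn):
    """CYL = RBN / 13; HEAD = RBN - (CYL * 13)."""
    return divmod(rbn, HEADS)


if __name__ == "__main__":
    print("RA90 LBN 23609 ->", lbn_to_cyl_head(23609, RA90))
    print("RA92 LBN 23609 ->", lbn_to_cyl_head(23609, RA92))
```

Run as written, the two printed results should agree with the worked examples: cylinder 26, 
head 4 for the RA90 and cylinder 24, head 11 for the RA92. 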
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5.2.3.11 Error Recovery Level (Selected Group) (Byte 13) 



BYTE 13    X X X X X X X X    ERROR RECOVERY LEVEL (SELECTED GROUP) 

GROUP NUMBER FOR LAST GROUP SELECT COMMAND, OR LAST SUCCESSFUL 
GROUP SELECT DURING A SEEK COMMAND (R/W HEAD NUMBER) 

CURRENT ERROR RECOVERY LEVEL 

CXO-2159A 



5.2.3.12 Error Code (Byte 14) 

Refer to Section 5.19 for drive error codes and their descriptions. 

5.2.3.13 Manufacturing Fault Code (Byte 15) 

Byte 15 contains the manufacturing repair code and is used by the repair depot. 

5.2.4 Drive Internal Error Log 

All drive-detected disk subsystem errors are recorded in the RA90/RA92 drive internal error log. 
Power-related errors are also recorded. ECC errors are not recorded in the drive internal error log. 

Figure 5-2 shows the RA90/RA92 drive internal error log memory layout; Figure 5-3 shows the 
RA90/RA92 drive internal error log header format; and Figure 5-4 shows the RA90/RA92 drive 
internal error log descriptor format. 

There are three ways to extract the RA90/RA92 drive internal error log: 

1. Run DKUTIL from the HSC console or KDM controller (see Section 5.2.4.1). 

2. Run utilities for two-board controllers. (Table 5-2 lists the systems that use two-board 
controllers.) 

3. Run drive-resident utility T41 from the RA90/RA92 OCP (see Section 5.2.4.2). 

Table 5-2 Two-Board Controller Diagnostics 

Monitor    KDA/KDB/UDA 

XXDP       ZUDM 
VDS        EVRLL 
MDM        Test drive internal error log utility at the device utility menu 
LABEL      ADDRESS    BYTE WIDE MEMORY 

LOGBUF     0A006H     START OF ERROR LOG HEADER 
SAVESET    0A010H     START OF POWER DOWN PAGE; FIRST 8 BYTES ARE DRIVE GENERIC 
SAVEO      0A018H     SECOND 8 BYTES ARE DRIVE SPECIFIC 
           0A025H     LAST BYTE OF HEADER 
           0A026H     UNUSED 
           0A02FH     UNUSED 
DSCBEG     0A030H     START OF ERROR LOG DESCRIPTORS 
           0A42FH     LAST BYTE OF LAST DESCRIPTOR 
DSCEND     0A430H     END OF DESCRIPTOR MARKER; FROM HERE ON EEPROM IS NOT 
                      USED FOR ERROR LOG 

CXO-2162A 

Figure 5-2 RA90/RA92 Drive Internal Error Log Memory Layout 
LOGBUF (ADDRESS LABEL) = 0A006H 

WORD 00    FFFB 
WORD 01    SIZE | DEVICE TYPE 
WORD 02    ERRORLOG SIZE 
WORD 03    LO ORDER SEEKS SINCE LAST POWERUP 
WORD 04    HI ORDER SEEKS SINCE LAST POWERUP 

SAVESET (ADDRESS LABEL) = 0A010H (POWER DOWN DATA*) 

WORD 05    LO ORDER CUMULATIVE SEEKS 
WORD 06    HI ORDER CUMULATIVE SEEKS 
WORD 07    LO ORDER TOTAL ELAPSED TIME (MIN) 
WORD 08    HI ORDER TOTAL ELAPSED TIME (MIN) 

WORDS 09 THROUGH 15, FIELDS IN FIGURE ORDER: 

           OCP SWITCH STATUS 
           UNIT NUMBER TENS DIGIT 
           UNIT NUMBER 1000 DIGIT 
           UNIT NUMBER ONES DIGIT 
           UNIT NUMBER 100S DIGIT 
           S.SA2 STATUS BYTE 
           CUMULATIVE NUMBER OF SPINUPS 
           NOT USED = 0000H 
           BAD ERROR LOG FLAG 
           FAULT TABLE POINTER 
           POINTER TO DESCRIPTOR ENTRY THAT FAILED 

*MUST BE SAVED AT AN EEPROM PAGE BOUNDARY (XXX0H). 

CXO-2160A 

Figure 5-3 RA90/RA92 Drive Internal Error Log Header Format 

DSCBEG (ADDRESS LABEL) = 0A030H 

DRIVE GENERIC INFORMATION 

WORD 00    ERROR TYPE | ERROR CODE 
WORD 01    FRU/DRU NUMBER | NUMBER OF ASCII BYTES 
WORD 02    LO NUMBER OF SEEKS AT TIME OF ERROR 
WORD 03    HI NUMBER OF SEEKS AT TIME OF ERROR 
WORD 04    ENTRY WRITE COUNT 

DRIVE SPECIFIC INFORMATION 

WORD 05    NUMBER OF SPINUPS SINCE FIRST POWERUP 
WORD 06    ERR RCVRY LVL | CURR GROUP | TBD 
WORD 07    DESIRED CYLINDER 
WORD 08    LO ORDER TOTAL ELAPSED TIME (MIN) 
WORD 09    HI ORDER TOTAL ELAPSED TIME (MIN) 
WORD 10    ASCII BYTE | ASCII BYTE 
WORD 11    ASCII BYTE | ASCII BYTE 
WORD 12    ASCII BYTE | ASCII BYTE 
WORD 13    ASCII BYTE | ASCII BYTE 
WORD 14    ASCII BYTE | ASCII BYTE 
WORD 15    ASCII BYTE | ASCII BYTE 

CXO-2161A 



Figure 5-4 RA90/RA92 Drive Internal Error Log Descriptor Format 
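
The addresses in Figure 5-2 and the sixteen-word descriptor in Figure 5-4 can be cross-checked 
with simple arithmetic. The sketch below is illustrative: it assumes 16-bit (2-byte) words, takes 
the descriptor area as 0A030H through 0A42FH from Figure 5-2, and uses a hypothetical zero-based 
slot index that is not defined anywhere in this manual. 

```python
WORD_BYTES = 2                 # 16-bit words (assumption stated above)
DESCRIPTOR_WORDS = 16          # words 00-15 in Figure 5-4
DESCRIPTOR_BYTES = WORD_BYTES * DESCRIPTOR_WORDS

DSCBEG = 0xA030                # start of error log descriptors (Figure 5-2)
LAST_DESCRIPTOR_BYTE = 0xA42F  # last byte of last descriptor (Figure 5-2)


def descriptor_count() -> int:
    """Number of whole descriptors that fit in the descriptor area."""
    return (LAST_DESCRIPTOR_BYTE - DSCBEG + 1) // DESCRIPTOR_BYTES


def slot_address(slot: int) -> int:
    """Start address of a descriptor slot (hypothetical zero-based index)."""
    if not 0 <= slot < descriptor_count():
        raise IndexError(f"slot {slot} is outside 0-{descriptor_count() - 1}")
    return DSCBEG + slot * DESCRIPTOR_BYTES


if __name__ == "__main__":
    print("descriptors:", descriptor_count())
    print(f"last slot starts at {slot_address(descriptor_count() - 1):05X}H")
```

Under these assumptions the area holds 32 descriptors, which matches the 1-32 entry range that 
DKUTIL prompts for when displaying the log. 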

5.2.4.1 Running DKUTIL From the HSC Console or KDM70 Controller 

Running DKUTIL from the HSC console controller dumps the drive internal error log to the HSC 
console. The same capability exists for the KDM70 controller. 

To display the drive internal error log, enter the DISPLAY ERROR command at the HSC prompt 
(see the example below). 

First do: 

DKUTIL> GET Dxxxx (if the drive is capable of being put on line) 
OR 
DKUTIL> GET Dxxxx/NOONLINE (if the drive is incapable of being put on line) 

THEN 

DKUTIL> DISPLAY ERROR 

Figure 5-5 shows an example of a formatted drive internal error log. The data in this example will 
help you determine the time elapsed since a failure occurred. 
ERROR LOG ENTRIES FOR DRIVE D090 

SELECT STARTING ENTRY LOCATION (1-32) [20]? 
ENTER HOW MANY ERROR LOG ENTRIES TO DISPLAY (0-32) [32]? 
PAUSE AND PROMPT AFTER 10 ERROR LOG ENTRIES [(Y),N]? Y 

DRIVE    MAX#ENTRYS    SEEKS/POWER ON    CUM. SEEKS    CUMULATIVE POWER-ON MINUTES 
TYPE     (D)           (D)               (D)           (D)           (H) 

RA90     64            328               9065          0000042695    0000A6C7* 

ENTRY   ENTRY   ERR   ERR    SEEK    MFG    DRIVE SPECIFIC HEX DATA          DRIVE ERR 
LOCTN   COUNT   TYP   CODE   COUNT   CODE   BYTE 0-9, RIGHT TO LEFT          MESSAGE 
(D)     (D)     (A)   (H)    (D)     (H)    (H)                              (A) 

20      4       PE    2B     8751    0D     00 00 3F 1C 00 00 00 00 00 17    inv.dmr.num. 
19      4       DE    F5     8751    11     00 00 3F 1C 00 00 00 00 00 17    tisp.sek.fii. 
18      4       RE    07     8731    0E     00 00 3E 95 05 2C 06 2C 00 15    frm.seq.err. 

(D) = decimal 
(A) = ASCII 
(H) = hex 

CALLOUTS ON THE DRIVE SPECIFIC HEX DATA: TIME**, CYL, ERROR REC LEVEL, 
HEAD/GROUP, N/A, SPIN-UPS SINCE FIRST POWER-UP 

 *  0000A6C7 (H)  CUMULATIVE POWER-ON MINUTES 
 ** 00003F1C (H)  LEFT-MOST FOUR TIME** BYTES          (SUBTRACT) 
    -------- 
    000067AB (H)  TIME LAPSE SINCE LAST ERROR          (EQUALS) 
             (D) = 26,539 MINUTES 

CONVERT HEX TIME LAPSE TO DECIMAL MINUTES, THEN 
CONVERT TO HOURS, THEN 
CONVERT TO DAYS. 

Figure 5-5 Drive Internal Error Log 

The ten bytes of drive-specific hex data printed by the DKUTIL utility are divided by the 
RA90/RA92 into five data fields. The drive specific hex data fields are: 

1. Time (minutes) 

2. Cylinder 

3. Head/group 

4. Undefined 

5. Spinups since the last power-up 

NOTE 

All five data fields represent the drive state at the time of the error. 
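
The time-lapse calculation shown in Figure 5-5 (cumulative power-on minutes minus the entry's 
time bytes, both in hex) can be reproduced as follows. This is a minimal sketch; the function 
names are hypothetical, and the only layout fact used is that the left-most four bytes of the 
drive-specific hex data are the time bytes. 

```python
def entry_time_minutes(drive_specific_hex: str) -> int:
    """Return the time field from the left-most four of the ten hex bytes."""
    byte_strings = drive_specific_hex.split()
    if len(byte_strings) != 10:
        raise ValueError("expected ten hex bytes")
    return int("".join(byte_strings[:4]), 16)


def minutes_since_error(cumulative_minutes_hex: str, drive_specific_hex: str) -> int:
    """Cumulative power-on minutes minus the entry's time, in decimal minutes."""
    return int(cumulative_minutes_hex, 16) - entry_time_minutes(drive_specific_hex)


if __name__ == "__main__":
    lapse = minutes_since_error("0000A6C7", "00 00 3F 1C 00 00 00 00 00 17")
    hours = lapse / 60
    days = hours / 24
    print(f"{lapse} minutes = {hours:.1f} hours = {days:.1f} days")
```

With the values from Figure 5-5 the result is 26,539 minutes (67AB hex), a little over 18 days 
of power-on time since the error was logged. 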
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5.2.4.2 Running the Drive-Resident Utility Dump (T41) From the OCP 

Run drive-resident utility T41 to display the drive internal error log. (Refer to Chapter 4 for 
instructions on how to run this utility.) The drive internal error log is displayed starting with the 
latest entry and continuing until all entries are displayed. Positions three and two represent the 
error log entry in decimal. Positions one and zero represent the two-digit LED hex error code. Each 
entry is displayed for 1.5 seconds. You can start or stop the display using the Run switch. 
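
A T41 display can be split into its two fields as sketched below. This is an illustration only: 
it assumes the display is read as a four-character string with position three as the left-most 
character, which this manual does not state explicitly, and the function name is hypothetical. 

```python
def decode_t41_display(display: str):
    """Split a four-character T41 display into (entry number, error code).

    Assumption: positions three and two are the two left-most characters
    (entry number, decimal); positions one and zero are the two right-most
    characters (LED error code, hex).
    """
    if len(display) != 4:
        raise ValueError("expected four display characters")
    entry_number = int(display[:2], 10)
    error_code = int(display[2:], 16)
    return entry_number, error_code


if __name__ == "__main__":
    entry, code = decode_t41_display("19F5")
    print(f"entry {entry}, error code {code:02X}")
```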

5.2.5 OCP Fault Indicator/Error Codes 

The OCP Fault indicator lights when a hard fault is detected. Select the Fault switch to display 
an error code. These error codes are described in Section 5.19. Each description includes fault 
isolation information. 

5.2.6 Drive Power Supply Indicator 

The drive power supply has a green LED that, when lit, indicates the power supply is operating 
normally. If the LED is not lit and the drive is experiencing problems, begin troubleshooting in this 
area. Figure 5-6 shows the location of the green LED. 



[Drive rear view. Callouts: QUARTER-TURN FASTENER (4), GREEN LED (POWER OK), 
CIRCUIT BREAKER] 

Figure 5-6 Power Supply Indicators 
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If the green LED appears to be at about half brilliance and the OCP has no display, the power 
supply is in a crow-bar state. Recycling the circuit breaker may clear the condition. 

5.2.7 Drive Error Reporting Mechanisms 

The RA90/RA92 detects and reports the majority of real-time errors and faults in the drive, 
including intermittent failures. 

All drive-detected errors are reported to the controller. If error logging is available and enabled, the 
controller reports errors to the host. 

5.2.7.1 Detailed Description of Error Reporting Mechanisms 

RA90/RA92 disk drives have five mechanisms available to report error conditions to the controller. 
The mechanism used is based on the state of the drive, the drive activity at the time of the error, 
and the error that occurred. The five mechanisms are listed below. As described in this list, it is 
assumed that a port or ports have been selected from the OCP port select switches. 

1. STOP TRANSMITTING CLOCKS AND DATA OVER ANY SDI LINE: The drive stops 
transmitting clocks and data over any SDI line connected to either port if any of the following 
conditions exist: 

• The drive is off line to the controller. 

• Power is failing. 

• A failure is detected that prevents communication between the drive and the controller. 

2. TRANSMIT CLOCKS BUT NO STATE INFORMATION: The drive transmits drive clock 
but does not transmit state (RTDS) information if it is off line to the controller or if it failed 
resident diagnostics. The only time a drive executes resident diagnostics is at power-up or reset 
and when an SDI INIT is received on the real time controller state (RTCS) line. If a drive 
receives an SDI INIT, it executes resident diagnostics verifying processor and communications 
paths to the controller. 

3. ASSERT ATTENTION IN THE RTDS: The drive uses the RTDS attention mechanism to 
report error conditions if the drive is on line to the controller. The RTDS attention mechanism 
is used when the command timer expires or when one of the generic status bits changes, with 
the following exceptions: when a generic status bit changes as a result of a correct operation 
during an SDI level 2 command, or when an error occurs in an SDI level 2 command. 

4. SEND UNSUCCESSFUL RESPONSE: An unsuccessful response to an SDI level 2 command 
is sent to the controller if any of the following conditions exist: 

• The execution of an SDI level 2 command could not be completed successfully. (For example, 
a level 2 DRIVE CLEAR command was issued but the error condition could not be cleared.) 

• A transmission error occurred during an SDI level 1 exchange and the drive successfully 
received a valid SDI level 1 end frame. 

• A protocol error occurred. 

• A fault occurred while the drive was executing a topology command. 

5. CONTROLLER RESPONSE TIMEOUT: This is not a drive mechanism, but it indicates to 
the controller that the drive has an error condition. 
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5.2.8 Host-Level Diagnostics and Utilities 

If possible, avoid running host-level diagnostics to recreate the symptoms. You only extend the 
service period. However, under certain conditions you may need to run host-level diagnostics. Refer 
to Section 5.11. 

Do not use host-level diagnostics to verify drive repair; use resident diagnostics tests. Use system- 
level commands to ensure the drive is on line and operating normally. 

5.3 General Troubleshooting Information 

The drive internal error log records all drive-detected (DD) faults as error codes. Use the recorded 
error codes to help isolate faults to a failing or failed FRU. Run the RA90/RA92 disk drive utility 
program T41, Display Drive Error Log, to extract drive internal error log information. 

Real-time faults detected by the disk subsystem are recorded in the host error log of the supporting 
operating system software. Host error logs contain detailed information on intermittent and hard 
drive errors and can also be used to isolate the failing field replaceable unit (FRU). 

ECC-type errors are detected by controllers and logged in the host (or HSC) level error logs. These 
errors are not recorded in the drive internal error log. The drive only reports drive-detected errors. 

Once a disk drive fault has been isolated to an FRU and repairs have been made, use drive-resident 
diagnostics to verify proper drive operation. 

5.3.1 Drive-Resident Diagnostics Limitations 

The following disk functions or areas are not covered by resident diagnostic testing: 

1. Customer data areas (are never read or written to during testing). 

2. Data paths between the drive and controller. 

3. Internal loopback testing (only tests the SDI loopback through the TSID gate array). External 
SDI testing can be accomplished with resident diagnostic T09 and use of a loopback connector 
(Digital part number 70-19074-01). 

"At-speed" testing of the SDI circuitry is not done. SDI interface testing is accomplished by 
internally looping the SDI signals within the SDI gate array and TSID. Transformer couplings 
are not tested. 

If you suspect media, go to Section 5.8. 

Drive-resident diagnostics descriptions are in Chapter 4. 

5.4 Step-by-Step Troubleshooting Procedure 

Use this troubleshooting procedure when you are reasonably certain the problem is in a disk drive. 
Some troubleshooting procedures may require that you follow the entire procedure before isolating 
the problem. If you have an error code, go to Section 5.19 for a description of the error and an FRU 
replacement list. 

Included in this section is a step-by-step troubleshooting flowchart (Figure 5-7). Each section 
heading that follows this flowchart contains a number, enclosed within a box, that corresponds to 
those in the step-by-step troubleshooting flowchart. 
Figure 5-7 Step-by-Step Troubleshooting Flowchart 

Flowchart legend: 

DDDE = DRIVE-DETECTED DRIVE ERROR 
DDDF = DRIVE-DETECTED DIAGNOSTIC FAULT 
DDPE = DRIVE-DETECTED PROTOCOL ERROR 
RE = TRANSMISSION ERROR 

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5.4.1 Troubleshooting Worksheet 

Develop a worksheet to aid in collecting error data. Identify only those errors being reported 
against the identified drive. Arrange a piece of wide, line-printer paper with columns identified as 
follows: 

MSCP Status/Event Code 

Comment Area 

Block Number 

Block Type (LBN or RBN) 

Cylinder 

Group 

Sector 

Drive LED Error 

Drive-Reported Previous/Current Group 

Date/Time of Error 

5.5 Identifying the Problem Drive 

The cause of local drive error problems generally requires minimum analysis. These problems 
can be identified by noting that the drive is not performing basic operational functions (power-up, 
spinup, spindown, and so on), by incorrect lamp indications, or by OCP error codes. 

Once you have isolated the problem drive, proceed to Section 5.6. 

If you have not isolated the problem drive, refer to Sections 5.5.1 through 5.5.6. These sections 
describe procedures to use for problem drive identification. 

5.5.1 Talking to the System Operator/Checking the OCP Fault Indicator 

Discuss drive errors with the system operator/manager and users. Operators or users can provide 
valuable information concerning system activity at the time of the error (such as applications that 
were running, disks the data is stored on, affected users, and impact on other applications). 

Check the OCP for fault indications. 

5.5.2 Using VAXsimPLUS to Identify the Problem Drive 

Use VAXsimPLUS to obtain a summary of information that may lead to direct identification of the 
failing drive. Section 5.1 lists appropriate VAXsimPLUS documentation. 

If the problem drive is identified using information obtained with VAXsimPLUS, go directly to 
Section 5.6. 

5.5.3 Using the Host Error Log to Identify the Problem Drive 

Study available host error logs. Host error logs provide failing drive and error code information. 
Use this information to identify failing FRUs. 

Refer to the DSA Error Log Manual for detailed descriptions of most system-level host error logs. 
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5.5.4 Using the HSC Console Log to Identify the Problem Drive 

Drives attached to HSC controllers send drive state information to the HSC console log. Use the 
HSC console log to identify problem drives. Correlate time-of-error information to user operations. 

5.5.5 Using the Host Console/User Terminal Trails to Identify the Problem Drive 

If no host error log or VAXsimPLUS resource is available, check host console trails or user terminal 
trails. These may indicate drive problems and identify the problem drive. 

5.5.6 Using Other Means to Identify the Problem Drive 

If no hard fault indications, error logs, or console logs are available to identify the problem drive, 
refer to Section 5.9. 

It is important to identify the failing disk drive before attempting to isolate the failing subsystem 
component. If more than one drive exhibits the same failure symptoms, examine the possibility of a 
controller or system problem. 

NOTE 

Using DSA utilities such as Error Log Dumper (ZUDM/EVRLL/MDM/DKUTIL) to dump
the RA90/RA92 drive internal error log may identify problem hardware areas. However, 
there may be a significant negative impact on the availability of hardware and data to 
the customer. Consider off-line diagnostics only as a last resort. 

DSA utilities (Bad Block Replacement or HSC Verify) verify that the logical structures of the user 
data are correct. Additionally, these utilities check the status of any revectored blocks, blocks with 
forced error flags set, blocks marked bad in the RCT area, the number of primary and non-primary 
replaced blocks, and blocks that exceed symbol error thresholds. User data areas that have flagged 
forced error conditions are identified as disk areas that cannot be accessed due to media or drive 
problems. 

Transient problems may require the use of off-line diagnostics. EVRL, ZUD, and MDM frequently 
miss a problem executing in the DBN area of a disk. You may have to exercise the customer data 
area of the disk to increase the chances of generating an error. 

CAUTION 

Back up customer data before executing diagnostics on customer data areas of the disk. 

Refer to Section 5.11 for host-level diagnostics information. 

5.6 Identifying the Problem FRU

After identifying the problem drive, you must identify the failing FRU. The following sections 
describe procedures to use for identifying the problem FRU. 

Use the host error log or HSC console log to fill in the troubleshooting worksheet (described in 
Section 5.4.1). Calculate the logical cylinder, group, and sector from the targeted LBN or RBN 
and add that information to the worksheet. Drive-reported errors (SDI error packet) include valid 
extended drive status bytes that call out the logical cylinder, the previous and current group select, 
and the master drive error code. 

After the data is collected, analyze the data to select the most logical replacement FRU. Proceed to 
Section 5.7 and compare the collected data to determine troubleshooting priority.

5.6.1 Pre-Verifying Drive Symptoms

After identifying the drive, you should verify drive failure symptoms by performing pre-verification 
testing of the drive. Pre-verification of drive symptoms using resident diagnostics has the following 
benefits: 

• Establishes a basis for post-verification and: 

- Ensures that no new problems have been introduced. 

- Ensures that a replaced FRU corrected the problems detected during pre-verification
testing. 

• Establishes a more reliable error code or condition to troubleshoot. Generally, errors detected 
while performing drive-resident diagnostics have a higher priority than errors or symptoms 
derived from any source previously mentioned. 

To complete pre-verification testing, perform the following steps: 

1. Spin up the drive. 

2. Execute resident diagnostic test T60 (loop-on-test utility). 

3. Execute resident diagnostic test T00 (sequence test).

Examine the drive internal error log and note the type of errors. Compare the generated errors to 
the error symptoms originally encountered. The following sections help isolate the failure symptoms 
to the failing FRU. 

5.6.2 Using OCP Error Codes to Identify the Problem FRU

Correlate error codes displayed in the OCP, host error logs, or drive internal error logs to error 
descriptions given in Section 5.19. Each error description includes a list of suggested replacement 
FRUs. Use this list to repair the drive. Verify repairs using the post-verification procedures defined 
in Section 5.13.2. 

5.6.3 Using VAXsimPLUS to Identify the Problem FRU

VAXsimPLUS identifies FRU replacements based upon an analysis of the errors being recorded by 
the VMS error logging system. VAXsimPLUS identifies the failing FRU through a theory number. 

The procedure for cross-referencing theory numbers to drive FRUs is determined by individual 
Digital service areas. Each service area has the responsibility of defining and implementing 
VAXsimPLUS in line with individual area service goals and strategies. 

If VAXsimPLUS identifies a failing FRU, replace the FRU then proceed with post-verification 
testing. Refer to Chapter 6 for FRU removal and replacement procedures. 

5.6.4 Using the Host Error Log to Identify the Problem FRU

If the system does not support host error logs, or if a host error log cannot be obtained, go to 
Section 5.6.5.

If you are working in a cluster environment, it may be easier to use the HSC console log. The HSC 
console log is a condensed version of the host error log. Proceed to Section 5.6.5 for information on 
using the HSC console log. 

The following is a data collection step: 

Access the host error log. Obtain the drive and controller event (error) codes. Note the LBNs 
involved in read/write disk transfer errors. 




Note the LBN being reported in the data transfer error packet. Also note if any of the following 
error types have been detected by the controller: 

• Data errors 

• ECC errors 

• Uncorrectable ECC errors 

• Header-not-found errors 

• Invalid header errors 

• Header compare errors 

• Format errors 

• Data sync timeout errors 

Study the SDI error packet of the error log for drive-detected errors and check for the following 
information: 

• Error code 

• Drive group number 

• Logical cylinder number 

For controller-detected (communication) errors, such as protocol or transmission errors, note the 
controller-reported error code in the status/event code field. 

5.6.5 Using the HSC Console Log to Identify the Problem FRU

If the disk drive is not attached to an HSC or KDM and no supporting error data is available, go to 
Section 5.6.6. 

The amount of subsystem error information reported by the HSC console log depends upon the HSC 
error threshold level setting. The HSC SETSHO utility can be set to alter the error threshold level 
as follows: 

• Information 

• Warning 

• Error 

• Fatal 

Execute the HSC SHO SYSTEM command to display the error threshold parameter setting. If the 
error threshold is set sufficiently high (fatal), no error information may be available from the HSC 
console log. Refer to Section 5.6.6 to continue error analysis. 
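
The effect of the threshold setting can be sketched as follows. This is a minimal illustration only: it assumes that a message reaches the console log when its severity is at or above the threshold, and the names Severity and is_reported are hypothetical, not part of the HSC software. Confirm the actual behavior in the HSC documentation.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The four threshold levels listed above, ordered from least to most severe."""
    INFORMATION = 1
    WARNING = 2
    ERROR = 3
    FATAL = 4


def is_reported(message_severity, threshold):
    """Assumption: a message is logged only if its severity meets the threshold."""
    return message_severity >= threshold


if __name__ == "__main__":
    # With the threshold at FATAL, lower-severity messages never reach the log.
    for level in Severity:
        print(level.name, is_reported(level, Severity.FATAL))
```

Under this assumption, a threshold of FATAL hides information, warning, and error messages, which is why the console log may contain nothing useful.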

If the drive is attached to an HSC, check the HSC console log. Use the HSC Service Manual 
to decode the console error log. Obtain status/event codes, drive extended status bytes for the 
drive LED error codes, and the LBN addresses at the time of the error. Organize the gathered 
information on the troubleshooting worksheet to help isolate the failing FRU. Proceed to Section 5.7 
and compare the collected data to determine troubleshooting priority. 

If the information from the HSC console log does not identify the problem FRU, go to Section 5.6.4 
to examine the host error log, or Section 5.6.6 to examine the drive internal error log. 

5.6.6 Using the Drive Internal Error Log to Identify the Problem FRU

If the drive is connected to a cluster, it is strongly recommended that you dump the drive internal 
error log before troubleshooting or attempting FRU replacement.

To extract the RA90/RA92 drive internal error log, use one of the following methods:

• Run DKUTIL from the HSC console or KDM controller (see Section 5.2.4.1). 

• Run drive-resident utility T41 from the RA90/RA92 OCP (see Section 5.2.4.5).

• As a last resort, run utilities such as MDM, EVRLL, or ZUDMxx. 

NOTE 

Off-line diagnostics remove system availability from the user and should only be used as
a last resort.

Media problems such as ECC errors are not logged in the drive internal error log. 
Proceed to Section 5.8 for media errors. 

If you cannot access the drive internal error log, verify the physical connection between the drive 
and the controller. If the drive is attached to an HSC, type a SHOW DISK command at the HSC 
console to verify that the drives are on line to the controller. 

If no errors have been logged, or the drive internal error log is inaccessible, proceed to Section 5.9. 

If a host error log or an HSC console trail has been acquired, proceed to Section 5.9. 

5.7 Priority Order of Troubleshooting DSA Errors

The priority order of troubleshooting DSA errors is important. The following sections describe the 
importance of each error type and DSA reporting mechanisms. 

5.7.1 Drive-Detected Drive Errors and Diagnostic Faults

Give error codes in this category top priority. 

Drive-detected drive errors (DDDEs) appear in host error logs and HSC console logs provided the 
error threshold is set low enough. DDDEs are also available in the drive internal error log. 

Drive-detected diagnostic faults (DDDF) appear in the drive internal error log, although they may 
be seen at the host level. This error type is top priority. 

5.7.1.1 Drive-Detected Protocol Errors Without Communication Errors

The occurrence of drive-detected protocol errors (such as errors 07, 0C, and so on) without the 
occurrence of transmission errors (errors 20, 21) indicates a controller problem or an electronic
control module (ECM) failure. Troubleshooting must be done on that basis. 

The occurrence of drive-detected transmission errors with error codes 08, 09, 0D, 0E, 0F, 10, 16, 19,
1A, 29, 2A, 2B, 2E, or 2F without communication errors generally indicates a controller problem.
The drive detects these errors by analyzing packet frames as they are being received. If the drive 
is at fault (in other words, replacing the controller did not fix the problem), replace the drive ECM 
module. 

5.7.1.2 Drive-Detected Pulse or State Parity Errors

The occurrence of transient, drive-detected communication errors occasionally causes a protocol 
error. This is generally a manifestation of communications problems. Determine if the problems 
occur on the transmit or receive lines from the controller to the drive. Drive error codes associated 
with pulse or parity errors are 0A, 20, or 21. 

If the drive is seeing drive-detected communication errors, then the drive ECM receive circuitry,
SDI port transmit circuitry (controller), or SDI cabling is suspect. Reconfiguration might further 
isolate the problem (use different drive/controller ports and cable combinations). 

If the controller is seeing communication errors (these also show up as ECC errors) and the drive 
is also seeing communication errors, then the whole path (drive to controller) is suspect. It is 
important to make a distinction between the communication errors and ECC errors. If an SDI 
subsystem is having communication errors, one of the manifestations (not the cause) is ECC errors. 
If the communication errors are severe enough, data transfers are halted. 

NOTE 

Fix communication problems before concentrating on ECC or positioner errors. 

Ensure SDI cable connections are secure enough to provide proper electrical and 
mechanical continuity. 
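
The triage rule in Sections 5.7.1.1 and 5.7.1.2 can be sketched as follows. This is an illustration under stated assumptions: the code sets contain only the drive error codes quoted in those sections (the text says "and so on", so the protocol set is not exhaustive), and the function name first_suspects is hypothetical. The suspects for communication errors are listed in the order the text names them, which is not a replacement priority.

```python
# Drive error codes quoted in Sections 5.7.1.1 and 5.7.1.2 (hexadecimal).
PROTOCOL_OR_TRANSMISSION_CODES = {
    0x07, 0x08, 0x09, 0x0C, 0x0D, 0x0E, 0x0F, 0x10,
    0x16, 0x19, 0x1A, 0x29, 0x2A, 0x2B, 0x2E, 0x2F,
}
COMMUNICATION_CODES = {0x0A, 0x20, 0x21}  # pulse or parity errors


def first_suspects(logged_codes):
    """Return the areas to examine first for a set of drive-detected error codes."""
    codes = set(logged_codes)
    # Communication problems are fixed first; they can cause protocol errors.
    if codes & COMMUNICATION_CODES:
        return [
            "drive ECM receive circuitry",
            "controller SDI port transmit circuitry",
            "SDI cabling",
        ]
    # Protocol or transmission errors alone point at the controller, then the ECM.
    if codes & PROTOCOL_OR_TRANSMISSION_CODES:
        return ["controller", "drive ECM"]
    return []


if __name__ == "__main__":
    print(first_suspects([0x07, 0x20]))
    print(first_suspects([0x1A]))
```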

5.7.2 Controller-Detected EDC Error

NOTE 

EDC errors are not caused by drives. 

EDC is a data protection mechanism to ensure data integrity within a disk controller. In contrast, 
the ECC mechanism ensures data integrity from the controller through the drive, to the media, and 
back again. ECC ensures integrity of customer data and the EDC mechanism together. 

It is important to note the differences in how controllers implement the EDC mechanism: 

• For the KDA/KDB/UDA family of controllers, EDC is generated on a sector of data at the bus 
interface as the data is initially read from host memory. EDC is verified on a sector basis as 
the data is written to host memory from the controller memory. Therefore, xDA/xDB controllers 
generate and check EDC. The microcode engine of the controller performs this check at the bus 
interface. 

• For HSC controllers, EDC is generated on a sector of data at the K.pli port processor module
as the data streams in from host memory over the CI bus. EDC then becomes an integral part
of the user data as the data is transferred to the HSC data memory. As this data is read out
of HSC data memory by the K.sdi modules and transmitted to the drive, user data EDC is
regenerated and checked in the K.sdi and compared to the EDC characters appended to the
data by the K.pli.

The EDC must check OK, or the write-transfer-to-disk will be aborted. The HSC again requests 
the data from host memory and again queues the write-transfer-to-disk when data becomes 
available in the HSC data memory. If the EDC checks OK at the K.sdi on a write-to-disk, the 
EDC and ECC codes are appended to the data stream and written to disk with ECC ensuring 
data integrity of the customer data and the EDC code. 

For a disk read, the data, as it is read by the K.sdi (over the SDI read/response line), is checked 
for good ECC, then the data plus EDC characters are stored in HSC data memory. As the data 
is sent to host memory, the K.pli, while transferring the data to host memory, verifies that good
EDC exists for the customer data block but does not transfer EDC characters to host memory. 
If EDC is bad, the K.pli informs the HSC functional code to again request the same data from 
the disk. 

• For KDM controllers, EDC is generated on a sector of data at the bus interface as the data is 
initially read from host memory. EDC is verified on a sector basis at the SDI SERDES port 
interface as the data is written to disk. On a read, EDC is checked by the SDI SERDES at the 
completion of each sector read (and data correction, if applicable). EDC is checked again as the 
data is written to host memory from the controller memory. 

If EDC errors are detected, the problem is a controller problem. The ECC is protecting the data
to and from the disk and checking the integrity of the data at the SDI port module logic. 

NOTE 

A properly functioning controller always reads bad EDC written to disks. However, if
bad EDC is written to a disk (improperly functioning controller), each time the block
with bad EDC is read, EDC errors are logged against the drive. Only after the data is
restored or rewritten to the disk with good EDC by a good controller will the errors go
away.
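
The layering of EDC and ECC can be illustrated with a short sketch. It is not the DSA implementation: CRC-32 is used here as a stand-in for both codes (the real ECC also corrects errors, which this sketch does not attempt), and the function names and field sizes are hypothetical. The point it shows is that ECC covers the data and the EDC together, so a block written with bad EDC passes the ECC check on every read and still fails the EDC check.

```python
import zlib


def make_sector(data, edc_ok=True):
    """Build data + EDC + ECC. edc_ok=False imitates a controller that writes bad EDC."""
    edc = zlib.crc32(data)
    if not edc_ok:
        edc ^= 0x1  # corrupt the EDC before the ECC is computed
    body = data + edc.to_bytes(4, "big")
    ecc = zlib.crc32(body)  # the ECC stand-in covers data and EDC together
    return body + ecc.to_bytes(4, "big")


def check_sector(sector):
    """Recompute both stand-in codes and report which checks pass."""
    body, ecc = sector[:-4], int.from_bytes(sector[-4:], "big")
    data, edc = body[:-4], int.from_bytes(body[-4:], "big")
    return {"ecc_ok": zlib.crc32(body) == ecc, "edc_ok": zlib.crc32(data) == edc}


if __name__ == "__main__":
    payload = bytes(512)
    print(check_sector(make_sector(payload)))                # both checks pass
    print(check_sector(make_sector(payload, edc_ok=False)))  # ECC passes, EDC fails
```

A damaged bit on the media, by contrast, fails the ECC check, which is why EDC errors point to the controller and not to the drive.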

5.7.2.1 Controller-Detected Protocol and Transmission Errors Without Communication Errors (Status/Event Codes 14B or 4B)

The troubleshooting process for this type of error is very similar to the discussion in Section 5.7.1.1. 
It is important to determine that the controller detected protocol errors without basic 
communications errors such as: 

• Protocol errors — A level 2 response from the drive had correct framing codes and checksum 
but was not a valid response under SDI protocol rules. If the opcode on the read/response line 
has an odd number of bits, it is an unknown opcode; if the response packet is bad, it is also 
classified as a protocol error. 

• Transmission errors — The controller detected an invalid framing code or a checksum error in 
a level 2 response from the drive. The UDA50 also returns the same status/event code for 
controller-detected protocol errors. 

Table 5-3 Summary of Controller-Detected Communication Errors

Controller-Detected                     Status/Event Code
Communication Errors          HSC     UDA     KDA     KDB     KDM

Protocol                      14B     4B      14B     14B     14B
Invalid frame code,           4B      4B      4B      4B      4B
level 2 checksum
Pulse/state parity (wire)     10B     10B     10B     10B     10B

Communication (wire) errors are described in Section 5.7.2.2. 
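
Table 5-3 can be expressed as a lookup, sketched below. The sketch reads the table's column headings as HSC, UDA, KDA, KDB, and KDM, which is an assumption about the printed table; the function name possible_error_types is hypothetical.

```python
CONTROLLERS = ("HSC", "UDA", "KDA", "KDB", "KDM")

# Status/event code reported by each controller type for each error type.
STATUS_EVENT_CODES = {
    "protocol": {"HSC": "14B", "UDA": "4B", "KDA": "14B", "KDB": "14B", "KDM": "14B"},
    "invalid frame code or level 2 checksum": dict.fromkeys(CONTROLLERS, "4B"),
    "pulse/state parity (wire)": dict.fromkeys(CONTROLLERS, "10B"),
}


def possible_error_types(controller, code):
    """List every error type that the given controller reports with this code."""
    return [
        error_type
        for error_type, by_controller in STATUS_EVENT_CODES.items()
        if by_controller[controller] == code.upper()
    ]


if __name__ == "__main__":
    # On a UDA, code 4B is ambiguous between protocol and transmission errors.
    print(possible_error_types("UDA", "4B"))
    print(possible_error_types("HSC", "4B"))
```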

5.7.2.2 Controller-Detected Pulse or State Parity Errors (Status/Event Code 10B)

The procedure for handling controller-detected communication errors is very similar to the one 
described in Section 5.7.1.2. The controller detected a pulse error on the state or data line, or the 
controller detected a parity error in a state frame from the drive. This error is associated with the 
controller and drive SDI port electronics (including interconnecting cables). 

The symptoms indicate a basic (wire) communications problem within the SDI pathway, including 
drive or controller port electronics. Noise can be injected through the port electronics or the cabling 
between the controller and the drive. Additionally, bad cables (bent, walked on) or loose connecting 
hardware (bulkhead connections) can contribute to the problem. 

Pulse errors are caused by two consecutive pulses of the same polarity. SDI signal lines use an NRZ 
transmission technique where no two adjacent pulses can be of the same polarity. This is detected 
on either the state or read/response line. 
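
The pulse rule can be sketched as a simple check. The representation is an assumption made for illustration: each received pulse is recorded as +1 or -1, and find_pulse_errors is a hypothetical name, not a controller function.

```python
def find_pulse_errors(pulses):
    """Return the indexes of pulses that repeat the polarity of the pulse before them."""
    return [i for i in range(1, len(pulses)) if pulses[i] == pulses[i - 1]]


if __name__ == "__main__":
    print(find_pulse_errors([+1, -1, +1, -1]))  # alternating polarity, no error
    print(find_pulse_errors([+1, -1, -1, +1]))  # two negative pulses in a row
```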

A state parity error is the occurrence of bad parity over the length of a single SDI RTDS state 
frame or SDI read/response frame. This type of error may also result in the detection of ECC errors 
during data transfer times. This occurs when the read/response line and the write/command line 
are functioning as the data line. 

Controller-detected transmission errors (4B) occur if an invalid framing code or a checksum error is
detected during a level 2 response from the drive. 

NOTE 

The UDA50 also returns this status/event code for controller-detected protocol errors. 

5.7.3 Controller-Detected Communication Events and Faults

Controller-detected communication events include: 

• Loss of read/write ready: MSCP Status/Event 8B

• Loss of receiver ready: MSCP Status/Event CB

• Receiver ready collisions: MSCP Status/Event 1AB

• Drive clock dropout: MSCP Status/Event AB

• Failure of drive initialization process: MSCP Status/Event 16B

• Failure of drive to respond to controller-requested initialization: MSCP Status/Event 18B

• SERDES overrun error (in controller): MSCP Status/Event 2A

• SDI drive command time-out: MSCP Status/Event 2B

Communication systems have faults and event irregularities. Communication faults are events, but 
not all events are faults. The difference is related to timing between events and system operations 
occurring at the time of the event. 

For example, a loss of read/write ready is an event if no write activity is occurring at the time of 
the loss. During a write, however, a loss of read/write ready is an error (fault) event. 
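
The code list and the event/fault distinction can be sketched as follows. The dictionary repeats the codes exactly as listed in this section; the function names are hypothetical, and the fault rule encodes only the read/write ready example given above.

```python
# MSCP status/event codes as listed in Section 5.7.3.
CONTROLLER_EVENTS = {
    "8B": "Loss of read/write ready",
    "CB": "Loss of receiver ready",
    "1AB": "Receiver ready collision",
    "AB": "Drive clock dropout",
    "16B": "Failure of drive initialization process",
    "18B": "Failure of drive to respond to controller-requested initialization",
    "2A": "SERDES overrun error (in controller)",
    "2B": "SDI drive command time-out",
}


def describe(code):
    """Translate a status/event code into the description used in this section."""
    return CONTROLLER_EVENTS.get(code.upper(), "not a Section 5.7.3 communication event")


def classify_rw_ready_loss(write_in_progress):
    """A loss of read/write ready is a fault only if a write was under way."""
    return "fault" if write_in_progress else "event"


if __name__ == "__main__":
    print(describe("8b"), "-", classify_rw_ready_loss(write_in_progress=True))
```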

5.7.3.1 Controller-Detected: LOSS OF READ/WRITE READY (Status/Event Code: 8B)

The controller event is LOST READ/WRITE READY DURING OR BETWEEN TRANSFERS. 
This error indicates read/write ready (RTDS status bit) was negated when R/W ready had been 
previously asserted (indicating completion of a preceding seek) and: 

• The controller attempted to initiate a transfer, or 

• A R/W ready was found negated at the completion of a transfer 

This event usually results from a drive-detected transfer error, in which case an additional error log 
message may be generated containing the drive-detected error event code. 

This error may be symptomatic of a fine track servo problem in the RA90/RA92 disk drive. If there 
are no other such subsequent error log entries, the loss of fine track was probably responsible for 
the loss of read/write ready. Examine the drive internal error log for evidence of servo problems. 

5.7.3.2 Controller-Detected: LOST RECEIVER READY (Status/Event Code: CB)

RECEIVER READY (RTDS status bit) was negated when the controller attempted to initiate a 
transfer, or RECEIVER READY was not asserted at the completion of a transfer. This includes all 
cases of the controller timeout expiring for a transfer operation (level 1 real-time command). 

As a consequence of this condition, the controller performs an SDI INIT then attempts to request 
a GET STATUS. The extended status error log entry returned in the GET STATUS command may 
indicate what the problem is. 

If no information is being reported by the drive as a part of the error log sequence, approach 
the problem as a drive ECM failure. Examine the drive internal error log for extended error 
information. 

5.7.3.3 Controller-Detected: RECEIVER READY COLLISION (Status/Event Code: 1AB)

The controller attempted to assert RECEIVER READY (RTCS status bit), indicating it was ready 
to receive a drive response. The drive RECEIVER READY (RTDS status bit) was still asserted, 
indicating it was ready to receive a command from the controller. 

This is not an error, but an event within the subsystem. All DSA drives and controllers occasionally 
log this event. There is no performance impact because of the occasional occurrence of this event. 
No data corruption is associated with the occurrence of this event if no other SDI bus errors occur 
at the same time. 

Acceptable event rates for RECEIVER READY collisions are less than ten per day, provided the 
following events are not contributing: 

• Broken physical SDI interconnects (plugging and unplugging SDI cables). 

• Controller (node) initializations or HSC failovers.

NOTE 

RECEIVER READY collisions occur primarily when both Ports A and B are enabled at the
drive.

Resolve unacceptable event rates of more than ten a day by replacing either the ECM or controller 
port interface module, cables, or bulkheads. 
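
A daily tally of collision events can be sketched as below. The sketch assumes the caller has already removed events that coincide with SDI cable changes, controller initializations, or HSC failovers, and that event timestamps are available as datetime values. The text does not say how to treat exactly ten events in a day; this sketch flags only days with more than ten. The name days_over_limit is hypothetical.

```python
from collections import Counter
from datetime import datetime

LIMIT_PER_DAY = 10


def days_over_limit(timestamps):
    """Return {date: count} for each day with more than LIMIT_PER_DAY collisions."""
    per_day = Counter(ts.date() for ts in timestamps)
    return {day: n for day, n in sorted(per_day.items()) if n > LIMIT_PER_DAY}


if __name__ == "__main__":
    busy_day = [datetime(1990, 5, 1, hour) for hour in range(11)]
    quiet_day = [datetime(1990, 5, 2, hour) for hour in range(3)]
    print(days_over_limit(busy_day + quiet_day))
```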

5.7.3.4 Controller-Detected: DRIVE CLOCK DROPOUT (Status/Event Code: AB)

Either data (read/response line) or state clock (RTDS) was missing when it should have been 
present. This is usually detected through a timeout. 

A fatal drive condition can cause the drive to drop the drive clocks. The drive should reassert 
clocks after performing a drive initialization and establishing clocks to the controller to re-establish 
communications and state information between the drive and controller. The sequence of getting 
status and error information then occurs. Analysis of error log message packets usually indicates 
that the above sequence has occurred. 

If such message packets are not being processed or received, it is possible that the condition cannot 
be detected by the drive. Execute drive SDI loopback tests to try to find subtle SDI problems. The 
order of emphasis is: 

• ECM 

• Controller port module 

• Cabling (including bulkhead connectors) 



5.7.3.5 Controller-Detected: DRIVE FAILED INITIALIZATION (Status/Event Code: 16B)

The drive clock failed to resume following a controller-attempted drive initialization. This implies 
the drive encountered a fatal initialization error. It may also indicate the drive was attempting its 
own initialization or that the drive is looping in an initialization state or routine. 

5.7.3.6 Controller-Detected: DRIVE IGNORED INITIALIZATION (Status/Event Code: 18B)

The drive clock continued running even though the controller attempted to perform a drive 
initialization. This implies the drive did not recognize the INIT command from the controller.
It may also indicate the drive was performing an initialization caused by some drive-detected 
condition and, in the course of initialization, ignored the controller's attempt to initialize the drive. 

5.7.3.7 Controller-Detected: SERDES OVERRUN ERROR (Status/Event Code: 2A)

SERDES overrun (or underrun) errors indicate that the drive is too fast for the controller or, more 
typically, a controller hardware fault is preventing the controller microcode from keeping up with 
data transfers to or from the drive. 

Because of the speed with which the RA90/RA92 disk drive handles data transfers, some SDI 
controller ports may not be able to keep up with data transfers to and from the drive. This speed 
sensitivity may even show up on drive ports that have successfully run other RA-type disk drives. 

This is not a universal problem with Digital SDI controller port boards. The controller port board
design supports RA90/RA92 operating speeds.

The SERDES overrun problem manifests itself as transient occurrences of the error or as solid 
SERDES problems preventing execution of read/write operations to the drive. For all controllers, 
the SERDES occurrence looks like a single controller port failure and is seldom related to a 
particular drive port. 



5.7.3.8 SDI Drive Command Timeout (Status/Event Code: 2B)

A controller may report an SDI command timeout when it issues a command to the drive and 
the drive does not respond within the required timeout period. The timeout period is command- 
dependent. 

SDI command timeouts are associated with Status/Event Code 2B. These events will frequently 
occur under the following conditions: 

• Powering up a drive with one or both port switches depressed, then hitting the Run switch. 

• Spinning down a drive with one or both port switches depressed. 

Under these two conditions, the SDI command timeout event reports can be ignored. However, 
under other conditions, you should examine SDI command timeout events by looking at the logged 
errors around the time of the event. The drive internal error log may also reveal clues to the 
problem; however, you should verify that the time of the error, as logged in the drive, corresponds 
to the time of the event. 

If the controller is an HSC, verify that the device priority is correctly managed. The RA90/RA92 
disk drive's place in the priority scheme is as follows: 

TA90 — highest priority 

RA90/RA92 

ESE2x 

RA82 

RA81 

RA70 

RA80 — lowest priority 

5.8 Media-Related Errors

Media and read/write transfer problems manifest themselves in many ways. Symptoms include: 

• ECC errors (refer to Section 5.16) 

• Uncorrectable ECC errors (refer to Section 5.16.1) 

• Header-not-found errors (refer to Section 5.16.1) 

• Invalid header errors (refer to Section 5.16.1) 

• Header compare errors (refer to Section 5.16.1) 

• Format errors 

• Data sync timeout errors 

Read/write errors may involve the read/write data path or defective media. For the SDI disk
subsystem, the read/write data path includes: 

• SDI controller read/write data path circuits 

• SDI cables and bulkhead connectors 

• Disk drive read/write data path hardware 

• Disk drive media 

Use the following process to analyze read/write transfer errors: 

1. Isolate the LBNs associated with the logged transfer errors in the host or HSC error log. If
there are many, randomly select 10 to 20. Use the appropriate algorithm to decode targeted
LBN numbers to the logical cylinder, group, and head. Refer to Example 5-1 for RA90 LBN
conversion and Example 5-2 for RA92 LBN conversion.

2. Decode the LBNs in question to physical cylinders, tracks, and groups (physical read/write
heads).
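
The random selection in step 1 can be sketched as follows. The sketch covers only the sampling; the LBN conversion itself is defined by the conversion examples and is not reproduced here. The function name pick_lbns and its parameters are hypothetical.

```python
import random


def pick_lbns(lbns, limit=20, seed=None):
    """Return all distinct LBNs if there are few, otherwise a random sample of `limit`."""
    unique = sorted(set(lbns))
    if len(unique) <= limit:
        return unique
    # A seed makes the selection repeatable for the troubleshooting worksheet.
    return sorted(random.Random(seed).sample(unique, limit))


if __name__ == "__main__":
    logged = [1200, 1200, 48211, 990017]
    print(pick_lbns(logged))
    print(len(pick_lbns(range(5000), limit=20, seed=1)))
```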

5.8.1 Repeating LBNs/RBNs

LBNs or RBNs that consistently recur in the host error log should be replaced. If the controller or 
system has not marked these for replacement, replace them manually by running HSC DKUTIL, 
EVRLK, ZUDLx, or MDM. This is a useful procedure for blocks that consistently report ECC or
data errors. 

This symptom occurs when the host bad block replacement (BBR) software does not use customer 
data as a pattern to test the suspect block. The block is initially flagged for replacement. The host
executes a test of the block and finds nothing wrong. It does not revector the block, but instead 
restores the original data back to the block. The user then attempts to access the data and may get 
another ECC error severe enough to invoke the BBR activity again. 

5.8.2 Excessive Number of Blocks Replaced Because of R/W Path Problems

Read/write data path problems may cause the replacement of a high number of good blocks. This 
may lead to logical fragmentation of the disk. If this happens, the number of blocks in the RCT 
recorded as revectored differs substantially from FCT information. For example, the RCT may show 
a doubling of replaced blocks occurring over a short period of time. Use EVRLB, MDM, ZUDKxx, or 
HSC FORMAT to reformat the disk and recover these good blocks. 

NOTE 

Back up customer data before executing the reformat. 

Use the host error log to identify replacement blocks and to show if BBR activity is complete. Use 
HSC DKUTIL to dump the factory scan (FCT) and RCT areas of the disk. Look for differences 
in the FCT and what is currently in the RCT. The contents of RCT only show what blocks were 
replaced; the host error log and HSC console logs supply the time of replacement. 

Keep good records in the site management/cluster guide. Include results of VERIFY and BBR scans 
of each disk. This information helps identify changes in block replacement activity and is part of 
good site management practices. 

5.8.3 LBN Correlation to Single Group/Track

Consistent failures involving one or two read/write heads usually indicate an HDA failure. 

5.8.4 LBN Correlation to Head Groups

Consistent failures within head groups are usually due to head selection logic within the HDA. The 
groups are as follows: 



RA90 (LA)        RA90 (SA)        RA92
70-22951-01      70-27268-01      70-27492-01
HDA Rev 00       HDA Rev 01       HDA Rev 10

0-3              0-2              0-2
4-7              3-6              3-6
8-11             7-9              7-9
12               10-12            10-12



Replace in the following order: 

1. PCM 

2. HDA 
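
Correlating failing heads to head groups can be sketched as below. The sketch assumes the three columns of head ranges belong to the RA90 (LA), RA90 (SA), and RA92 HDAs in that order, and that the head number for each error has already been decoded from the LBN. The names HEAD_GROUPS, head_group, and errors_per_group are hypothetical.

```python
from collections import Counter

# Inclusive head ranges for each head group, per HDA variant.
HEAD_GROUPS = {
    "RA90-LA": [(0, 3), (4, 7), (8, 11), (12, 12)],
    "RA90-SA": [(0, 2), (3, 6), (7, 9), (10, 12)],
    "RA92": [(0, 2), (3, 6), (7, 9), (10, 12)],
}


def head_group(variant, head):
    """Return the index of the head group that contains the given head."""
    for index, (low, high) in enumerate(HEAD_GROUPS[variant]):
        if low <= head <= high:
            return index
    raise ValueError(f"head {head} is outside the data heads 0-12")


def errors_per_group(variant, heads):
    """Count logged errors per head group to expose a group-level pattern."""
    return Counter(head_group(variant, head) for head in heads)


if __name__ == "__main__":
    print(errors_per_group("RA92", [3, 4, 4, 6, 5, 11]))
```

If most errors fall in one group while the heads within that group vary, the pattern points to head selection logic and not to a single head.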

5.8.4.1 LBNs Correlated to Zone Write Boundaries

Failures showing no consistency to a group or head may show consistency in write current zones. 
DSA drives divide the media into different write current amplitude zones. The RA90/RA92 divides 
the media into four write current amplitude zones as listed in Table 5-4.

Table 5-4 RA90/RA92 Write Zones

                 RA90                             RA92
        Cylinder                         Cylinder
Zone    Range        LBN                 Range        LBN

0       0000-1722    0-1546428           0000-2014    0-1912234
1       1723-2020    1546429-1813724     2015-2363    1912235-2243435
2       2021-2335    1813725-2096289     2364-2731    2243436-2592667
3       2336-2660    2096290-2377747     2732-3112    2592668-2954237



To verify this correlation, you need a substantial number of errors (greater than 100) and knowledge 
of the user disk space being used. A customer using more than 50 percent of the available disk 
space is probably accessing all zones of the disk. A customer using less than 25 percent of the disk space
may only be accessing a single zone. Knowledge of operating system utilization of disk space is 
necessary to make this troubleshooting procedure effective. 
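
The zone correlation can be sketched as a lookup against the LBN ranges in Table 5-4. The sketch takes the LBN boundaries as printed and assumes the first table row is zone 0; the names write_zone and errors_per_zone are hypothetical.

```python
import bisect
from collections import Counter

# First LBN of zones 0 through 3, and the last LBN of zone 3, from Table 5-4.
ZONE_START_LBN = {
    "RA90": [0, 1546429, 1813725, 2096290],
    "RA92": [0, 1912235, 2243436, 2592668],
}
ZONE_END_LBN = {"RA90": 2377747, "RA92": 2954237}


def write_zone(drive, lbn):
    """Return the write current zone (0-3) that contains the given LBN."""
    if not 0 <= lbn <= ZONE_END_LBN[drive]:
        raise ValueError(f"LBN {lbn} is outside the {drive} ranges in Table 5-4")
    # bisect_right counts how many zone start values are <= lbn.
    return bisect.bisect_right(ZONE_START_LBN[drive], lbn) - 1


def errors_per_zone(drive, lbns):
    """Count logged errors per write zone."""
    return Counter(write_zone(drive, lbn) for lbn in lbns)


if __name__ == "__main__":
    print(errors_per_zone("RA90", [10, 1546428, 1546429, 2200000, 2300000]))
```

A tally of well over 100 errors that concentrates in one zone, on a disk known to be in use across all zones, supports a zone-related diagnosis.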

Zone-related problems encountered with the RA90/RA92 disk drives generally are resolved by 
replacing the PCM, ECM, or HDA (in that order). 

5.8.4.2 LBN Correlation to a Physical Cylinder

Failures consistently related to a specific cylinder may be the result of a head touchdown. Problems
involving servo detection information (dedicated and/or embedded) that prevent head tracking to
cylinders usually indicate media corruption. These problems can involve the HDA and ECM electronics.
Failures caused by a head crash are usually confined to specific cylinders and may span an area as
wide as ten cylinders. Failures confined to one to three cylinders usually indicate servo data failures.

In the RA90/RA92, logical cylinders correlate to physical cylinders. 
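
The cylinder-span guideline can be sketched as below. The thresholds come from the paragraph above (one to three cylinders, up to ten cylinders); treating the span as the distance between the lowest and highest failing cylinder is an assumption, and the name classify_cylinder_cluster is hypothetical. The result is a hint for further analysis, not a diagnosis.

```python
def classify_cylinder_cluster(cylinders):
    """Suggest a likely cause from the span of consistently failing cylinders."""
    if not cylinders:
        raise ValueError("at least one failing cylinder is required")
    span = max(cylinders) - min(cylinders) + 1
    if span <= 3:
        return "possible servo data failure"
    if span <= 10:
        return "possible head touchdown or crash"
    return "no single-area cylinder correlation"


if __name__ == "__main__":
    print(classify_cylinder_cluster([1500, 1501]))
    print(classify_cylinder_cluster([1500, 1504, 1508]))
```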

5.8.5 Multiple Controllers Report Same Error Types

If multiple controllers report the same error types and only one drive port (after cable swap) reports 
the error, it is likely an ECM problem. 

If multiple controllers report the same error types and both drive ports report the same error, 
replace drive components in the following order: 

1. PCM 

2. ECM 

3. SDI cabling/interconnects 

4. Power source 

5. Spindle ground brush 

6. HDA 

5.8.6 Only Single Controller Port Affected

If errors occur to a single controller port and both drive ports have been tested to a known good 
controller interface, then the problem is in the controller or cable. 

5.8.7 Isolating Random R/W Transfer Errors

NOTE 

You are here only because the disk drive is experiencing random read/write transfer 
errors or because your checklist has led you here. If you have not pinpointed the failure, 
see Section 5.9. 

Random physical cylinder and head failures are generally caused by ECM/SDI/SDI-controller 
interface problems. A faulty spindle ground mechanism or a power supply exceeding noise 
specifications may also cause a drive to exhibit random errors. 

Intermittent read/write problems involving random read/write heads and cylinders may be the 
result of intermittent failures through the read/write data path. This includes SDI cabling or 
read/write data path hardware in the controller. 

5.8.7.1 Not Defined to a Specific Drive/Controller Port 

This is a decision point for the first-time call effort with random read/write errors. If working from 
a miscellaneous check or action item list, proceed to Section 5.9. 

For the RA90/RA92 drive, replace parts in the following order: 

1. PCM 

2. ECM 

3. Cabling (reconfigure) 

4. Power supply 

5. Spindle ground brush 

6. HDA 
5.9 Miscellaneous Checks

Miscellaneous checks are provided as an alternative when: 

• No host error log is available. 

• No HSC console trail is available. 

• No errors are logged in the drive internal error log. 

• Errors are transient or not reproducible through standalone diagnostics. 

If you cannot access the RA90/RA92 drive internal error log from the OCP, replace FRUs in the 
following order: 

1. ECM 

2. OCP 

3. Power supply 

If you cannot access the RA90/RA92 drive internal error log with DKUTIL or EVRLL/ZUDM/MDM,
perform the following:

1. Execute resident diagnostic test T00 (drive spun down).

2. Execute resident diagnostic test T00 (drive spun up).

3. Execute external loopback SDI test T09 (use loopback connector Digital part number 
70-19074-01). 

4. Check drive power supply and indicators. See Section 5.2.6 for the location of power supply 
indicators and their meanings. 

5. Check drive power supply for proper voltages and ripple (noise). See Chapter 1 for power supply 
operating specifications. 

6. Check spindle ground brush for excessive wear. 

7. Check the SDI cable by changing the cable. 

8. Check the controller port by connecting the SDI cable to another port. 

Unreliable power from the power supply, controller, or source power may cause the drive to exhibit 
a variety of unrelated errors. Ensure source power is within tolerances and follow suggested drive 
power checks. 

If all checks have been made and no problem is found, replace the ECM. The ECM is the most 
likely FRU to fail, provided the failing drive has been correctly identified. 

Use the Customer Support Center for problems beyond the scope of your experience or this manual. 

NOTE

For transient disk subsystem errors, running host-level diagnostics on xDA/xDB
controllers seldom isolates errors without long run times. This seriously impacts system
availability to the customer. Use system-level and drive internal error logs whenever
possible.

5.10 Are You Lost?

If you feel that the problem is beyond your capabilities and you have spent too much time trying 
to isolate it, use available support resources. Digital Customer Services should operate within the 
Management Action Planning (MAP) guidelines for each respective area of the country/world. 

If you are in the process of performing action items, complete those items and reenter the drive 
fault evaluation phase after collecting new error data. 
5.11 Using Host-Level Diagnostics as a Last Resort

There are significant concerns about running standalone diagnostics in troubleshooting RA90/RA92
disk problems. Running standalone diagnostics extends site time and makes the system
unavailable to the customer. Customer Services goals are to maximize system or device availability
to the customer and minimize repair time. Consider running host-level diagnostics only if you have
exhausted all options. Tables 5-5 through 5-7 contain the names of diagnostics that are compatible
with the RA90/RA92 disk drives.

CAUTION

Back up customer data before executing diagnostics on customer data areas of the disk.
Protection of customer data is your responsibility.

Follow the strategy which is in place to provide quick and accurate diagnosis, repair, and validation.
This strategy minimizes the impact on system or device availability.

5.11.1 HSC-Based Diagnostics 

Use HSC utilities (DKUTIL) and diagnostics (ILEXER and ILDISK) in a cluster environment. 
Though the diagnostics are in line and do not cause a loss of system availability, device availability 
is an issue. With that in mind, examine the drive internal error log prior to running standalone 
diagnostics. 

To execute the in-line tests or utilities, the drive must first be dismounted. The rest of the disk 
subsystem will not be affected. DKUTIL, ILEXER, and ILDISK do not adversely affect the drive; 
however, ensure customer data is protected. While running these tests, give errors detected by the 
drive or controller top priority. 

5.11.2 KDM-Based Diagnostics 

Use KDM utilities (DKUTIL) and diagnostics (ILEXER and ILDEVO) in a cluster environment. 
Though the diagnostics are in line and do not cause a loss of system availability, device availability 
is an issue. With that in mind, examine the drive internal error log before running standalone 
diagnostics. 

To execute the in-line tests or utilities, the drive must first be dismounted. The rest of the disk 
subsystem will not be affected. DKUTIL, ILEXER, and ILDEVO do not adversely affect the drive;
however, ensure customer data is protected. While running these tests, give errors detected by the 
drive or controller top priority. 

5.11.2.1 On Line from VMS 

Use the following procedure to access and run on-line programs on a KDM controller. See
Section 5.11.2.2 for instructions on accessing and running programs in standalone mode.

NOTE

You cannot run on-line diagnostics, exercisers, and utilities without first running
EVRLN.KDM. Follow the procedure shown here.

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> CONNECT FYA0/NOADAPTER
SYSGEN> EXIT
$ SET DEFAULT SYS$MAINTENANCE
$ SET HOST/DUP/SERVER=DUP/LOAD=EVRLN.KDM PUA0/DEVICE
$ SET HOST/DUP/SERVER=DUP/TASK=ILDEVO PUA0/DEVICE
5.11.2.2 Running Standalone Programs from the VAX Diagnostic Supervisor

DS> ATTACH KDM70 HUB DUx N BR
                         | |
                         | BUS REQUEST
                         |
                         NODE NUMBER

DS> SELECT DUx

DS> RUN EVRLN

EVRLN> RUNL ILDEVO

5.11.3 xDA Controller-Based Diagnostics 

To run standalone diagnostics or utilities (excluding EVRAE) through any UDA, KDA, or KDB
controller, the operating system must be shut down and the appropriate diagnostic/supervisor 
loaded. 

Some diagnostics force error conditions to validate the drive's ability to detect error conditions. 
Error conditions detected by the drives are logged to the drive internal error log as a normal course 
of operation. Therefore, through several iterations of a standalone diagnostic, the drive internal 
error log may be overwritten and the real drive-detected errors lost. 

For example, running a single iteration of MDM on a MicroVAX may result in 13 error events. These
events are logged to the drive's internal error log (EEPROM) and may overwrite important error
information.

With that in mind, examine the drive internal error log before running standalone diagnostics. 

A recent SDI specification change addresses this issue by having the controller disable drive 
error logging during drive testing. The following diagnostic software releases incorporate the SDI 
specification changes: 

• XXDP: Release 135 (Q3FY88)

  - ZUDG rev C0

  - ZUDH rev C0

• MDM: Release 122 (Q3FY88)

  - NAKDAH

• VDS: Release 31 (Q4FY88)

  - EVRLF version 8.3

  - EVRLG version 8.3

If any errors occur while running disk diagnostics, go to Section 5.6. 

If multiple errors occur, go to Section 5.13.1. 

If no errors occur, go to Section 5.10 and call remote support. 
Table 5-5 VDS-Based Off-Line Diagnostics

Diagnostic   Title
EVRLB        Drive formatter
EVRLF        Tests 1-3
EVRLG        Test 4
EVRLJ        Tests
EVRLK        Bad block replacement utility (Scrubber)
EVRLL        Drive-resident error log utility
EVRAE        MSCP disk exerciser
EVSBA        VAX autosizer

Table 5-6 MDM-Based Off-Line Diagnostics

Diagnostic   Title
MDM          MicroVAX diagnostic supervisor [1]

[1] Currently has a problem identifying drive unit number.



Table 5-7 XXDP-Based Off-Line Diagnostics

Diagnostic   Title
ZUDH [2]     Tests 1-3
               Test 1: UNIBUS interrupt/address test
               Test 2: Executes drive-resident diagnostics
               Test 3: Disk function test (rd/wrt)
ZUDI [2]     Test 4: Disk exerciser
ZUDJ         Test 5: UDA/KDA50 subsystem exerciser
ZUDK         Formatter
ZUDL         Bad block replacement utility
ZUDM         Disk-resident error log utility

[2] Forces errors during run that are logged in the drive internal error log.



5.12 Exiting Data Collection: Action Item List Process

Your goal during the data collection phase is to collect logged subsystem events including:

• Status/event codes from error log packets

• Drive-detected master error codes

• Identified target LBN numbers






When no host or HSC error log information is available, use the drive internal error log or 
operator/system console trail to identify the problem drive. In some isolated cases (less than 
one percent), you will have to use a troubleshooting worksheet (described in Section 5.4.1) in place 
of system logged information. You should leave this phase ready to analyze collected data or with 
an action item list. 

5.13 FRU Replacement

Replace an FRU only after: 

• Analysis of VAXsimPLUS directed a replacement FRU based upon its analysis of occurring 
errors or error rates. 

• Analysis of host error logs resulted in a list of error codes with particular emphasis placed on
identifying drive-detected error codes. The error codes should predominantly be drive error
codes. In some circumstances, error codes are generated by the controller.

• Analysis of the HSC console log resulted in a list of drive error codes used in identifying 
replacement FRUs. 

• Analysis of the drive internal error log led to an identification of a replacement FRU. 

• Analysis of miscellaneous checks or the process of elimination identified an FRU replacement. 

Once an error code has been established from one of the previously mentioned sources, refer to
Section 5.19 for error code descriptions and suggested FRU replacements.

5.13.1 Multiple Error Codes

If a number of different error codes are detected, consider the following to decide which error code(s) 
to use for troubleshooting: 

• Give error codes obtained from running internal drive diagnostics top priority. 

• Select an error code or symptom that indicates the least number of FRUs. Drive-detected errors 
of this type will have been derived using the least amount of circuitry to isolate the particular 
failure. 

• Select the error code that occurs most often. 

• Select the FRU that is most commonly indicated by different error codes. 

• Select the FRU that most commonly indicates the same manufacturing code (Section 5.2.3.13). 

5.13.2 Service Post-Verification



After replacing an FRU or repairing a drive, execute drive-resident diagnostics. You can do this 
through power-up and spin-up cycles or by using tests which exercise the repaired FRUs. Compare 
the results to the diagnostics executed during pre-verification testing (Section 5.6.1). 

Post-verification testing accomplishes the following: 

• Verifies that no new problems have been introduced when servicing or replacing FRUs. 

• Verifies that a repair or replaced part corrected any problems detected during pre-verification 
testing. 

If the same error code(s) occur during post-verification testing, reinstall the original FRU. Continue 
troubleshooting procedures, or replace the next identified FRU in the appropriate list. 
If the diagnostics pass successfully, the problem has most likely been resolved, with the following
exception: 

• If the original error codes used for FRU isolation were the result of host, controller, or drive 
internal error log entries (not duplicated by running pre-verification testing), the problem may 
be due to an intermittent failure. Proceed to Section 5.13.3.

If any errors occur, you may want to reinstall the original drive FRU and go to Section 5.6.1. 

5.13.3 Return Disk Drive to User

After checkout is complete, return the disk drive to the user. Have the user exercise the repaired 
disk drive through customer applications. If customer applications appear to be functioning 
normally, the call can be closed. 

If the drive fails, return to Section 5.6 or call remote support. 

If there is a question as to the correct identity of the failing disk drive, return to Section 5.5. 

5.14 Performance Issues When No Errors Are Being Logged

Customer complaints of disk performance can require a fair amount of analysis. Often the 
performance complaints are quite subjective. The following list of questions may help analyze 
performance complaints: 

1. Do the performance issues relate to all or most of the disks? 

If so, ensure that system parameters comply with suggested guidelines. Cluster size of disks, 
working set size parameters, paging parameters, and ACP/XQP-related parameters all can 
affect performance. 

2. Do the performance problems occur during image activation (when a large sized application 
program is initially started)? 

Many layered products require some time to fully activate. This is not a disk problem. 

3. Is the performance problem noticed by users of the same image, layered product, or file on the 
(same) disk? 

If the disk is attached to a local controller (UDA/KDA/KDB) but is a VAX node member in a 
cluster, then request that the file/image/layered software product be moved to a disk on the 
HSC. Local serving of disks creates bus, VAX, and I/O overhead that impacts performance. 

4. Is the performance problem noticed by users of a file/image/layered product that resides on the 
same disk as the swap and page files? 

If so, request the system manager monitor paging and swapping activity. High page/swap 
rates decrease VMS response and create an I/O bottleneck for the page/swap disk. Request the 
file/image/layered product be moved to another disk. 

In addition to system parameter settings, two areas of the architecture (hardware-related) 
contribute to actual loss of performance. These include: 

1. Nonprimary replacements in a critical file or directory structure, such as the following 
examples: 

• Nonprimary replacement in VMS disk:[000000]INDEXF.SYS

• Nonprimary replacement in a frequently used directory file 

The two examples are of files that may affect the perceived performance of a disk. However, the 
location of a block of data within a file and how the operating system is set up equally affect 
nonprimary replacement which, in turn, impacts system or disk drive performance. 
A nonprimary replaced block in the INDEXF.SYS file of a disk could be very significant if it is
in the front of the file. However, if it is the last block within the file, it might not have as large 
an impact on system performance. 

A nonprimary replacement in a block within SYS.EXE that is loaded once by VMS into memory 
(at startup) and stays resident in memory has no effect on performance. However, if the block is 
within a portion of SYS.EXE that is frequently brought in by VMS, it could impact performance. 
A solution is to increase the VMS working set size. 

A nonprimary replaced block within a swap or paging file has little performance impact. If 
the system is doing enough paging and swapping to notice the occurrence of nonprimary 
replacements, the real problem may be with the user or system working set size. Performance 
may improve if the system manager adjusts system parameters around paging and swapping. 

VMS uses virtual block file structures, not logical blocks. VBNs do not correlate to LBNs. To
correlate an LBN to the affected file, contact someone familiar with the operating system file 
structure, such as VMS ODS-2. Identifying affected files within ODS-2 is very complicated. 

2. Difficulty (but success) in achieving fine track following a seek. 

The RA90/RA92 disk drive utilities T36, T38, and T39 measure various seek time parameters. 
Compare measured times to drive specifications in cases where seek time is in question. 

Temperature can affect the performance of T36 and T38. 

5.15 Troubleshooting VMS Mount Verification

EXE$MOUNTVER is the VMS executable mount verification process to bring disks back on line 
after a problem has made the drives inaccessible to a host VAX. It is a very complicated process. 
If any failure to reinitialize the disk occurs, or if EXE$MOUNTVER exceeds its allowed timeout 
period (default 10 minutes), the host logs a mount verification error to the host error log. 

5.15.1 VMS Mount Verification 

The mount-verification feature of Files-11 disk handling generally leaves users unaware that a
mounted disk has gone off line and returned on line (or in some other way has been unreachable 
and then restored). Mount verification is the default parameter for EXE$MOUNTVER, with the 
following exceptions: Disks mounted /FOREIGN and disks mounted /NOMOUNTVERIFICATION 
do not undergo mount verification except during cluster state transitions. 

Drives dual-ported through HSC controllers should never be mounted /NOMOUNTVERIFICATION 
because this may prevent VMS from failing the drive over to the secondary HSC controller. 

EXE$MOUNTVER sends status messages to OPCOM. Because there are cases when mount 
verification messages are needed at the operator console and OPCOM might not be able to provide 
them, mount verification also sends special messages with the prefix %SYSTEM-I-MOUNTVER to 
the operator console, OPA0. 

5.15.2 VMS Problems Surrounding Diagnosis of "Why a Drive Mount-Verifies"

VMS calls EXE$MOUNTVER if a drive loses contact with the system. (For example, the controller 
sends a command to the drive but does not get a successful response back within the controller- 
specific timeout period.) The process verifies that the disk VMS reestablished contact with is the 
same disk originally connected. 

Sending the drive to the mount-verify state involves: 

1. The host initiating an MSCP ONLINE command to the drive modifier followed by a GET UNIT 
STATUS (GUS). 
2. The host reading the home block and comparing the volume information (serial number, name,
etc.) for the drive before VMS lost contact and after VMS reestablished contact with the drive 
during mount verification. 

This sequence is repeated until success or timeout. This sequence is made evident by the drive
having a port light on and the Ready light blinking quite slowly as the controller accesses the FCT 
for the on line and LBN block for the media ID, effectively doing full-stroke seeks. 

The MVTIMEOUT system parameter defines the time (in seconds) allowed for a pending mount 
verification to complete before it is aborted. This dynamic parameter should always be set to a 
reasonable value for the typical operations at the site. 

NOTE 

Do not use values less than the recommended default of 600 seconds (10 minutes). 

After a mount verification times out, any pending I/O requests to the volume will fail. Try to 
execute the DISMOUNT/ABORT command which allows a subsequent mount to be successful if the 
MV-timer has previously expired. In some extreme cases, drive failures may require a reboot of the 
controller; some require a reboot of the system. 

Entry and exit to or from MOUNT VERIFY are time stamped. VAXcluster time stamps may vary 
across the cluster nodes due to differences in the TOY clocks and the initial clock times. Slight 
variations in time stamps do not indicate multiple drive or controller failures causing MOUNT 
VERIFICATION, but rather one drive or controller failure causing every node to enter MOUNT 
VERIFICATION at their own locally specified time. 

Some reasons why a drive enters mount verification: 

• Disk drive dropped off line because of: 

- Port switch glitch. 

- Drive fault. 

- Lost communications with controller or cable fault (drive temporarily went away and came 
back). 

• Drive status changed (operator physically did something with the drive). 

• Operator changed drives (pack) (causes mount verify that will never complete).

• Someone accidentally pushed the Write Protect switch. 

By noting the time duration of the mount verification and other circumstances surrounding the 
mount verify status, you can determine some valuable troubleshooting information. 

How long did the mount verify take? 

Less than MVTIMEOUT and the drive eventually succeeded. 

A few seconds — implying a glitch or a recoverable fault. 

Did the drive appear on another controller after the mount verification? If so, it could be 
a port-related problem. 

Thirty seconds to a minute to remount probably means the drive was spun down and had to be 
spun back up. Was this due to a drive fault? Did it run its spin-up diagnostics error free? 

Infinite time probably means that, along with the drive disappearing, it also changed its media ID,
or it is a different drive, or it continually fails its spin-up diagnostics, or there is a hard fault on the
drive.
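
The duration guidelines above can be expressed as a small triage helper. The sketch below is illustrative only: the function name is hypothetical, and the 10-second cutoff for "a few seconds" is an assumption, since the text gives no exact figure. The 30-to-60-second band and the 600-second MVTIMEOUT default come from this section.

```python
def classify_mount_verify(duration_s, mvtimeout_s=600, glitch_limit_s=10):
    """Suggest a likely cause from how long a mount verification lasted.

    glitch_limit_s is an assumed cutoff for "a few seconds".
    """
    if duration_s >= mvtimeout_s:
        # Never completed within MVTIMEOUT ("infinite time").
        return ("timed out: media ID changed, different drive, "
                "failing spin-up diagnostics, or hard fault")
    if duration_s <= glitch_limit_s:
        return "glitch or recoverable fault"
    if 30 <= duration_s <= 60:
        return "drive was probably spun down and spun back up"
    return "no guideline: examine the error logs"


for seconds in (3, 45, 200, 600):
    print(seconds, "->", classify_mount_verify(seconds))
```

The result is only a starting point. Confirm it against the drive internal error log and the host or HSC console log, as the questions that follow suggest.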

What happened? 

VMS does not log errors during the MOUNT VERIFY process, although it may log some before or 
after, depending on how the drive failed. 
Did the drive see a fault during this period? (Examine the drive internal error log for error
information.) 

Were any errors logged to the host or HSC console log before or after the mount verify? 

Is it always the same drive? 

Do any nonexistent drive numbers appear which may characterize a unit select problem? 

Was there a last-fail packet from the xDA/xDB shortly after, meaning the controller 
faulted/initialized as well? 

Did all the drives on a port/K/controller fail? 

5.15.3 Non-VMS Mount Verification 

RSTS 9.x is tolerant of DSA drives dropping off line. It reinitializes the drive and puts it back on 
line. Most other drives remain off line unless the driver is patched to reissue on line before every 
command (as RSX does). 

5.16 Troubleshooting ECC Errors on RA90/RA92 Disk Drives 

Disks are getting bigger and faster. As disk bit and track density increases, the electronics and 
mechanical components of the subsystem operate under tighter constraints. This means that error 
recovery mechanisms within the architecture may be called upon more frequently to compensate for 
these narrow tolerances. 

This is one of the significant advantages of a Digital storage solution. Digital integrates into the 
design of the controller and the drive error recovery attributes that enhance and ensure data 
integrity and delivery to the user. Plug-compatible manufacturers (PCMs) of storage devices, by not 
owning the design of both ends of the subsystem (controller and drive), are left with little capacity 
to implement such techniques. 

The RA90/RA92 disk drive has 14 different error recovery mechanisms (reference Appendix B) and, 
therefore, affords excellent recovery potential for data errors. These error recovery mechanisms 
provide the margins necessary to protect customer data at increased densities and to ensure that 
the data is always delivered successfully. 

In order to better determine the significance of logged correctable and uncorrectable ECC errors, 
and for assistance in troubleshooting either, note the discussions and error log examples in the 
sections that follow. 

5.16.1 Uncorrectable ECC Errors: MSCP Status/Event E8

An uncorrectable ECC error is architecturally defined as the occurrence of a controller logging an 
MSCP status/event E8 as a result of a read data error. There are two uncorrectable ECC error 
types: hard and soft. Both types are reflected by a single MSCP status/event code. 

The next two sections help the engineer determine whether the status/event was hard or soft, and
whether it was significant or insignificant.

5.16.1.1 Hard Uncorrectable ECC Errors 

A hard uncorrectable ECC error is the occurrence of an uncorrectable ECC error that renders the 
drive unable to recover data through any retry or recovery mechanism. An uncorrectable ECC error 
is not considered "hard" until all attempts at getting the data are exhausted and the controller has 
to terminate its attempts. 

Example 5-3 shows a VMS error log error packet where the data was lost due to a hard error. The 
fields of note are emphasized in bold. 
******************************* ENTRY 23. *******************************
ERROR SEQUENCE 3885.                          LOGGED ON: SID 0200620E
DATE/TIME 30-JAN-1989 19:54:03.77                   SYS_TYPE 00000000
SCS NODE: PICKUP

ERL$LOGMESSAGE ENTRY KA750 REV# 14.  UCODE REV# 98.

I/O SUB-SYSTEM, UNIT _HSC013$DUA36:

MESSAGE TYPE            0001
                                    DISK MSCP MESSAGE
MSLG$L_CMD_REF          AF66000F
MSLG$W_UNIT             0024
                                    UNIT #36.
MSLG$W_SEQ_NUM          0054
                                    SEQUENCE #84.
MSLG$B_FORMAT           02
                                    DISK TRANSFER ERROR
MSLG$B_FLAGS            E0
                                    BAD BLK REPLACEMENT REQUEST
                                    OPERATION CONTINUING
                                    OPERATION SUCCESSFUL
MSLG$W_EVENT            00E8
                                    DATA ERROR
                                    UNCORRECTABLE ECC ERROR
MSLG$Q_CNT_ID           0000F20D
                        01010000
MSLG$B_UNIT_SVR         0B
                                    UNIT SOFTWARE VERSION #11.
MSLG$B_UNIT_HVR         01
                                    UNIT HARDWARE REVISION #1.
MSLG$B_LEVEL            01          <-- Last retry level after invoking levels 14 through 2.
MSLG$B_RETRY            05          <-- Fifth retry was attempted at the above retry level.
MSLG$L_VOL_SER          0000036C
                                    VOLUME SERIAL #876.
MSLG$L_HDR_CODE         000E75BD
                                    LOGICAL BLOCK #947645.
                                    GOOD LOGICAL SECTOR

CONTROLLER DEPENDENT INFORMATION

ORIG ERR                8010
                                    EDC ERROR
                                    ECC ERROR
ERR RECOV FLGS          0003
                                    LBN REPLACEMENT INDICATED
                                    ERR LOGGED TO CONSOLE AND HOST
LVL A RETRY             00
LVL B RETRY             00
BUF DAT MEM ADR         C41B
SRC REQ #               03
DET REQ #               03

**************************************************************************

Example 5-3 VMS Uncorrectable ECC Error Log (Hard)
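
The error log shows each MSLG$ field in hexadecimal on the left and its decimal decode on the right. As a quick cross-check when reading a packet, the hexadecimal values from Example 5-3 can be converted by hand or with a short Python sketch such as the following. The dictionary and variable names are illustrative; the values are copied from the example.

```python
# Hexadecimal field values copied from Example 5-3.
fields_hex = {
    "MSLG$W_UNIT": "0024",
    "MSLG$W_SEQ_NUM": "0054",
    "MSLG$B_UNIT_SVR": "0B",
    "MSLG$B_UNIT_HVR": "01",
    "MSLG$B_LEVEL": "01",
    "MSLG$B_RETRY": "05",
    "MSLG$L_VOL_SER": "0000036C",
    "MSLG$L_HDR_CODE": "000E75BD",
}

# Convert each hexadecimal string to its decimal value.
fields_dec = {name: int(value, 16) for name, value in fields_hex.items()}

for name, value in fields_dec.items():
    print(f"{name:<16} {fields_hex[name]:>8} = {value}")
```

The decoded values agree with the text column of the example: unit 36, sequence 84, unit software version 11, volume serial 876, and logical block 947645.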
The disk subsystem will attempt to recover from an uncorrectable ECC error by retrying the
transfer five times. For an RA90/RA92 disk drive, the controller would then invoke drive recovery 
level 14 and execute that recovery mechanism up to five times, then invoke drive recovery level 13, 
and so on, until executing the last recovery level (1). 
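
The recovery order described above can be enumerated to show why a hard error is typically reported with recovery level 1 and retry count 5. The following Python sketch is a model of the description only, not controller firmware. The function name is hypothetical, and representing the initial plain retries as level 0 is an assumption based on the "recovery level of 0" wording in Section 5.16.1.2.

```python
def recovery_attempts(top_level=14, retries_per_level=5):
    """Yield (recovery_level, retry_number) in the order attempts are made."""
    # Plain retries of the transfer, with no drive recovery mechanism (level 0).
    for retry in range(1, retries_per_level + 1):
        yield (0, retry)
    # Drive recovery levels, from the highest level down to level 1.
    for level in range(top_level, 0, -1):
        for retry in range(1, retries_per_level + 1):
            yield (level, retry)


attempts = list(recovery_attempts())
print("attempts before the error is declared hard:", len(attempts))
print("last attempt (level, retry):", attempts[-1])
```

Under these assumptions the controller makes at most 5 + (14 x 5) = 75 attempts, and the final attempt is level 1, retry 5, which matches the MSLG$B_LEVEL and MSLG$B_RETRY values in Example 5-3.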

Note that for UDA controllers, the reported recovery levels from the controller will differ from what 
the other controllers will report. 

5.16.1.2 Soft Uncorrectable ECC Errors 

A soft uncorrectable ECC error is the occurrence of an uncorrectable ECC error on the first read 
attempt; however, a successful recovery level and/or retry was made and the data was read 
successfully (with eight or fewer symbols in error). In such a case, the block is flagged as a BBR
candidate for testing purposes by the HSC controller (or in case of a UDA/KDA/KDB controller, the 
host operating system driver). 

For uncorrectable ECC errors (MSCP status/event E8), the following items should be considered: 

• For the RA90/RA92 disk drive, examine the error log and determine that the MSLG$B_LEVEL
and MSLG$B_RETRY fields (for VMS) are being reported as follows:

If the recovery level is reported as 0 and the retry count is 1 for the uncorrectable ECC errors,
an occasional error under high I/O rates may be considered normal. The normal recovery will 
occur on the first retry with a recovery level of 0. If more than a single retry is necessary, 
and especially if other levels of recovery are necessary, this indicates potentially more serious 
error conditions, including the legitimate condition whereby a block is going bad and needs 
replacement. 

The RA90 short-arm HDA and the RA92 HDA will show improved (decreased) ECC error rates. 
The nominal distribution of uncorrectable ECC errors for an RA90 disk drive with a long-arm 
HDA operating at very high I/O rates should appear as follows: 

- Ninety percent of the errors occur in the top five heads (heads 0 through 4).

- One of the heads (in the 0-4 range) will have no errors logged.

- At least three of the top five heads will have errors of this type.

- You should have a sample size of at least 16 uncorrectable ECC errors for examination. If
this distribution of errors is not met, then further analysis should be done.

For example, if 10 of the 13 heads are logging these data errors, then consider it a general 
read path problem and troubleshoot accordingly. 

If distribution is to a single head, then consider the likelihood of a defective HDA. 

If error log information indicates that data recovery was accomplished by utilizing a 
drive error recovery level of 7 through 14 (head offset mechanism), then consider HDA 
replacement (especially if 9A, 9B, or 9C errors are being logged in the drive as well). 

• Each error log entry of an uncorrectable ECC error should be followed by a BBR packet 
(reference Section 5.16.2.1). The MSCP status/event code should reflect a 34, BBR replacement 
attempted but block tested okay. Blocks in a normal drive will be retired at a very low rate (less 
than 20 percent of the time) for the normal transient occurrence of uncorrectable ECC errors on 
RA90 disk drives. 
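
The head distribution guidelines in the first item above lend themselves to a quick tally. The Python sketch below is illustrative only and applies to an RA90 with a long-arm HDA at very high I/O rates. The function name and return strings are hypothetical, and treating "10 or more heads" as the threshold for a general read path problem is an assumption drawn from the 10-of-13 example in the text.

```python
from collections import Counter

TOP_HEADS = range(5)  # heads 0 through 4


def assess_head_distribution(heads, min_sample=16, widespread_heads=10):
    """Compare logged uncorrectable ECC errors (one head number per error)
    with the nominal distribution described in the text."""
    counts = Counter(heads)
    total = len(heads)
    if total < min_sample:
        return "insufficient sample"
    if len(counts) == 1:
        return "single head: consider a defective HDA"
    if len(counts) >= widespread_heads:  # assumed threshold
        return "widespread: consider a general read path problem"
    top_share = sum(counts[h] for h in TOP_HEADS) / total
    top_heads_with_errors = sum(1 for h in TOP_HEADS if counts[h])
    # Nominal: 90 percent in heads 0-4, three or four of those heads affected.
    if top_share >= 0.9 and 3 <= top_heads_with_errors <= 4:
        return "nominal"
    return "further analysis needed"


# Hypothetical sample: 16 errors, 15 of them on heads 0, 1, and 3.
sample = [0] * 6 + [1] * 5 + [3] * 4 + [7]
print(assess_head_distribution(sample))
```

A "nominal" result does not rule out a failing block. Also check the recovery level used and whether each error is followed by a BBR packet, as described above.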

Example 5-4 has three fields of note (emphasized in bold). The first emphasized field denotes the
actual MSCP status/event logged (00E8), and a bit-to-text decode denoting that the read error was 
an uncorrectable ECC error. 

The second field of note indicates how the subsystem recovered from the error condition; in this 
case, a single retry was successful with no special error recovery mechanism being invoked to aid in 
the recovery of the data. 
The third emphasized field is the field within an error log packet that, for an ECC-type MSCP
status/event packet, typically has no meaning and will in almost all cases indicate zeros. This section
of an error log packet will, however, contain significant information for the interpretation of MSCP
status/event 6B error packets.



******************************* ENTRY 29. *******************************
ERROR SEQUENCE 3885.                      LOGGED ON: SID 0200620E
DATE/TIME 30-JAN-1989 19:54:03.77                    SYS_TYPE 00000000
SCS NODE: PICKUP

ERL$LOGMESSAGE ENTRY KA750 REV# 14. UCODE REV# 98.

I/O SUB-SYSTEM, UNIT _HSC013$DUA36:

MESSAGE TYPE          0001
                                DISK MSCP MESSAGE
MSLG$L_CMD_REF        AF66000F
MSLG$W_UNIT           0024
                                UNIT #36.
MSLG$W_SEQ_NUM        0054
                                SEQUENCE #84.
MSLG$B_FORMAT         02
                                DISK TRANSFER ERROR
MSLG$B_FLAGS          E0
                                BAD BLK REPLACEMENT REQUEST
                                OPERATION CONTINUING
                                OPERATION SUCCESSFUL
MSLG$W_EVENT          00E8
                                DATA ERROR
                                UNCORRECTABLE ECC ERROR
MSLG$Q_CNT_ID         0000F20D
                      01010000
MSLG$B_UNIT_SVR       0B
                                UNIT SOFTWARE VERSION #11.
MSLG$B_UNIT_HVR       01
                                UNIT HARDWARE REVISION #1.
MSLG$B_LEVEL          00        < No drive recovery invoked
MSLG$B_RETRY          00        < Single retry was successful
                                < Minimal impact event
MSLG$L_VOL_SER        0000036C
                                VOLUME SERIAL #876.
MSLG$L_HDR_CODE       000E75BD
                                LOGICAL BLOCK #947645.
                                GOOD LOGICAL SECTOR

CONTROLLER DEPENDENT INFORMATION

ORIG ERR              8010
                                EDC ERROR
                                ECC ERROR
ERR RECOV FLGS        0003
                                LBN REPLACEMENT INDICATED
                                ERR LOGGED TO CONSOLE AND HOST
LVL A RETRY           00        < For data problems, these
LVL B RETRY           00        < fields should contain zeros
BUF DAT MEM ADR       C41B
SRC REQ #             03
DET REQ #             03

***********************************************

Example 5-4 VMS Uncorrectable ECC Error Log - Soft
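The hexadecimal fields of an error log entry correspond to the decimal values in the decoded text beside them, so one can be used to cross-check the other. The following is a minimal Python sketch of that cross-check using values copied from Example 5-4. The dictionary and function names are illustrative only and are not part of any VMS utility.

```python
# Hex field values copied from Example 5-4; keys mirror the log labels.
FIELDS = {
    "MSLG$W_UNIT": "0024",
    "MSLG$W_SEQ_NUM": "0054",
    "MSLG$B_UNIT_SVR": "0B",
    "MSLG$L_VOL_SER": "0000036C",
    "MSLG$L_HDR_CODE": "000E75BD",
}


def decode_fields(fields):
    """Return each hexadecimal field as a decimal integer."""
    return {name: int(value, 16) for name, value in fields.items()}


decoded = decode_fields(FIELDS)
for name, value in decoded.items():
    print(f"{name:<16} {value}")
```

In this entry the decoded values match the text in the log: unit 36, sequence 84, unit software version 11, volume serial 876, and logical block 947645.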
5.16.2 Correctable ECC Errors: MSCP Status/Event Codes 1A8, 1C8, 1E8

Correctable ECC errors are those where the data was read with symbols in error above the drive 
threshold (6-8 symbols for the RA90/RA92 disk drive). For ECC errors (MSCP status/event codes 
1A8, 1C8, and 1E8), consider the following: 

• For an RA90 disk drive with a long-arm HDA, an occasional ECC error (including 6-8 symbols
in error and soft uncorrectable errors) may be considered normal when the drive has sustained
or burst I/O rates above 30 I/Os per second.

The RA90 short-arm HDA and the RA92 HDA show a marked improvement (decrease) in ECC 
error rates. 

The nominal distribution of correctable ECC errors for an RA90 disk drive with a long-arm 
HDA should appear as follows: 

- Ninety percent of the errors occur in the top five heads (heads 0 through 4).

- One of the heads (in the 0-4 range) will have no errors logged.

- At least three of the top five heads will have errors of this type.

- You have a sample size of at least 16 ECC errors of this type for examination.

If this distribution of errors is not met, then further analysis should be done.

For example, if 10 of the 13 heads are logging these data errors, then consider it a general 
read path problem and troubleshoot accordingly. 

If distribution is to a single head, then consider the likelihood of a defective HDA. 

If error log information indicates that data recovery was accomplished by utilizing a 
drive error recovery level of 7 through 14 (head offset mechanism), then consider HDA 
replacement (especially if 9A, 9B, or 9C errors are being logged in the drive as well). 

• Each error log entry of an ECC (6-8 symbol) error should be followed by a BBR packet 
(reference Section 5.16.2.1). The MSCP status/event code should reflect a 34, BBR replacement 
attempted but block tested okay. Blocks in a normal drive will be retired at a very low rate 
(less than 20 percent of the time) for the normal transient occurrence of correctable ECC errors 
on RA90 disk drives. 
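The nominal head distribution described above can be checked mechanically once the error counts per head have been tallied from the error log. The following Python sketch is one possible reading of those guidelines, not a Digital tool. The function name and input layout are assumptions, and "one of the heads will have no errors logged" is interpreted here as "at least one of heads 0 through 4 has no errors".

```python
TOP_HEADS = range(0, 5)   # heads 0 through 4
MIN_SAMPLE = 16           # minimum number of errors needed for a judgment


def distribution_is_nominal(errors_by_head):
    """errors_by_head maps head number -> count of logged ECC errors.

    Returns True or False, or None when the sample is too small to judge.
    """
    total = sum(errors_by_head.values())
    if total < MIN_SAMPLE:
        return None
    top_counts = [errors_by_head.get(head, 0) for head in TOP_HEADS]
    share_in_top = sum(top_counts) / total
    heads_with_errors = sum(1 for count in top_counts if count > 0)
    heads_without_errors = sum(1 for count in top_counts if count == 0)
    return (
        share_in_top >= 0.90
        and heads_with_errors >= 3
        and heads_without_errors >= 1
    )


print(distribution_is_nominal({0: 6, 1: 5, 2: 4, 4: 1, 9: 1}))
print(distribution_is_nominal({7: 20}))
```

A result of False only indicates that further analysis is needed. The single-head and many-head cases still have to be interpreted as described in the text.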

5.16.2.1 BBR Packet 

ECC errors that exceed the drive threshold initiate BBR algorithms. The BBR algorithms are
provided to test, verify, and replace (if needed) defective media spots or marginal media/head spot
combinations (assuming no data path problems). When the BBR algorithms find no need to replace
a block, the original error may have been transient or caused by mechanisms not attributable to
actual head/media margins. These above-drive-threshold ECC errors (or uncorrectable ECC errors)
may be caused by drive phenomena other than bad media/heads.

The BBR packet, which is generated at the completion of the BBR algorithm, will contain several 
important clues about the nature of the ECC error. Included in the packet is whether the block 
tested good or bad, and whether the original data was recovered or restored with the FORCED 
ERROR flag set, indicating the data was lost. 

The following MSCP status/event codes are applicable for a BBR packet: 

MSCP status/event 14 - Bad block successfully replaced.
MSCP status/event 34 - Block verified okay; not a bad block.
MSCP status/event 54 - Replacement failure; replace command failed.
MSCP status/event 74 - Replacement failure; inconsistent RCT.
MSCP status/event 94 - Replacement failure; drive access failure.
MSCP status/event B4 - Replacement failure; no block available.
MSCP status/event D4 - Replacement failure; two successive RBNs were bad.
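When scanning many BBR packets, the status/event codes listed above can be translated with a simple lookup table. The following Python sketch assumes the event value is taken from the MSLG$W_EVENT field of the packet as a hexadecimal string. The function name is illustrative only.

```python
# BBR status/event codes and meanings as listed in this section.
BBR_EVENTS = {
    0x14: "Bad block successfully replaced",
    0x34: "Block verified okay; not a bad block",
    0x54: "Replacement failure; replace command failed",
    0x74: "Replacement failure; inconsistent RCT",
    0x94: "Replacement failure; drive access failure",
    0xB4: "Replacement failure; no block available",
    0xD4: "Replacement failure; two successive RBNs were bad",
}


def describe_bbr_event(event_hex):
    """Translate a hexadecimal MSLG$W_EVENT value from a BBR packet."""
    code = int(event_hex, 16)
    return BBR_EVENTS.get(code, "Not a BBR status/event code listed here")


print(describe_bbr_event("0034"))
```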

Example 5-5 illustrates the result of the BBR replacement algorithm. In this example, the block
in question did go through BBR; however, the block was not replaced. Further in the example,
the replace flags show that the block was not replaced because the block "verified good." The last
segment of the BBR log packet reveals why the block was tested at all. In this example, the block
was thought to contain a data error with a severity level of "uncorrectable ECC."

5.17 Troubleshooting Controller-Detected Positioner Errors— MSCP 
Status/Event 6B 

MSCP status/event 6B is a positioner unintelligible header error (also referred to as a positioner 
error mis-seek). Several considerations must be weighed when troubleshooting the MSCP 6B event. 
These include: 

• For RA90/RA92 disk drives, what is the I/O rate on the drive?

• Is only one SDI path noting the problem?

• Are other errors being logged at or near the same frequency as the MSCP 6B?

• For RA92 disk drives, what is the write-to-read ratio?

• What recovery level/mechanism is the controller using in order to recover from the situation?

With the RA90/RA92 disk drive, if examination of the error log shows that:

• the Level A retry mechanism is successful on the first retry, and

• the Level B retry mechanism is not being used (reported Level B retry count = 0), and

• all errors are being recovered on a single retry,

then an error rate of six per day may be considered nominal for RA90/RA92 disk drives
operating near or above 30 I/Os per second.
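The guideline above can be expressed as a small check over the Level A and Level B retry counts taken from each logged 6B packet. The following Python sketch is an illustration under stated assumptions: the function name and input layout are hypothetical, "successful on first retry" is read as a Level A retry count of 1, and the sketch does not verify that the drive is operating near or above 30 I/Os per second.

```python
NOMINAL_EVENTS_PER_DAY = 6  # guideline rate from the text


def six_b_events_look_nominal(events, days):
    """events: list of (level_a_retries, level_b_retries), one per 6B packet.
    days: length of the observation period in days.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    # Every event must be recovered by a single Level A retry, no Level B.
    all_single_retry = all(a == 1 and b == 0 for a, b in events)
    rate_per_day = len(events) / days
    return all_single_retry and rate_per_day <= NOMINAL_EVENTS_PER_DAY


print(six_b_events_look_nominal([(1, 0)] * 10, days=2))
```

Any event that needed more than one Level A retry, or any Level B retry, makes the check fail regardless of the event rate.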

Example 5-6 illustrates a typical RA90 error log on a VMS system. The fields of note are
emphasized in bold.

5.17.1 RA92 Disk Drive With MSCP Status/Event 6B 

RA92 disk drives may log more occurrences of MSCP status/event 6B than RA90 disk drives in 
applications during which long sequences of write activity are occurring. This phenomenon, as a 
contributor to 6B events, was recently discovered and identified. Though it occurs more often with 
the RA92 disk drive, heavy write-to-read ratios could be a contributor to logged MSCP 6B events by 
RA90 disk drives. 

The problem originates in the design of the heads and appears while a head is involved in large
sequential write transfers. When the head has to switch back to read (for the next header
identification), noise can result in the head that disrupts the header signal as it is
read. No identifiable damage to the actual header information is exhibited on the media. Customer
data is not at risk. The noise merely disrupts the read chain momentarily as the header is being
read. By the time the next sector comes around, the read chain will have stabilized.

This head phenomenon will result in additional 6B errors being logged when the write-to-read
ratios are heavily weighted in favor of writes. Typical VMS environments may not provide this
scenario. It has been noted that typical ULTRIX/UNIX applications appear to have a higher mix of
write-to-read activity than VMS applications. However, regardless of the operating system, certain
applications may increase the potential of this phenomenon occurring when those applications, by
their nature, have heavy write-to-read ratios.
****** ENTRY 6., ERROR SEQUENCE 4709. LOGGED ON SID 05283914

ERL$LOGMESSAGE ENTRY KA820 REV# E PATCH REV# 28. UCODE REV# 20.
BI NODE # 2.

I/O SUB-SYSTEM, UNIT HSC015$DUA36:

MESSAGE TYPE          0001
                                DISK MSCP MESSAGE
MSLG$L_CMD_REF        6BBC000A
MSLG$W_UNIT           0024
                                UNIT #36.
MSLG$W_SEQ_NUM        0002
                                SEQUENCE #2.
MSLG$B_FORMAT         09
                                BAD BLOCK REPLACEMENT ATTEMPT
MSLG$B_FLAGS          80
                                OPERATION SUCCESSFUL
MSLG$W_EVENT          0034
                                BAD BLOCK REPLACEMENT
                                BLOCK VERIFIED GOOD
MSLG$Q_CNT_ID         0000FC15
                      01200000
                                UNIQUE IDENTIFIER, 00000000FC15(X)
                                MASS STORAGE CONTROLLER
                                HSC70
MSLG$B_CNT_SVR        27
                                CONTROLLER SOFTWARE VERSION #39.
MSLG$B_CNT_HVR        00
                                CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT       0060
MSLG$Q_UNIT_ID        000003F6
                      02130000
                                UNIQUE IDENTIFIER, 0000000003F6(X)
                                DISK CLASS DEVICE (166)
                                RA90
MSLG$B_UNIT_SVR       0B
                                UNIT SOFTWARE VERSION #11.
MSLG$B_UNIT_HVR       01
                                UNIT HARDWARE REVISION #1.
MSLG$W_RPL_FLGS       0000
                                REPLACEMENT ATTEMPTED, BLOCK
                                VERIFIED GOOD
MSLG$L_VOL_SER        0000036C
                                VOLUME SERIAL #876.
MSLG$L_BAD_LBN        00175A52
                                BAD LOGICAL BLOCK
                                NUMBER = 1530450.
MSLG$L_OLD_RBN        00000000
MSLG$L_NEW_RBN        000056A4
MSLG$W_CAUSE          00E8
                                DATA ERROR
                                UNCORRECTABLE ECC ERROR

*******************************************

Example 5-5 VMS BBR Packet
VMS LOGGED MSCP '6B' POSITIONER ERRORS

******************************* ENTRY 1. *******************************
ERROR SEQUENCE 2151.                      LOGGED ON: SID 1105009C
DATE/TIME 26-JUL-1990 11:12:49.31                    SYS_TYPE 00000000

ERL$LOGMESSAGE ENTRY KA88 REV# 5. CPU # 0.

I/O SUB-SYSTEM, UNIT _HSC4$DUA39:

MESSAGE TYPE          0001
                                DISK MSCP MESSAGE
MSLG$L_CMD_REF        56310024
MSLG$W_UNIT           0027
                                UNIT #39.
MSLG$W_SEQ_NUM        001B
                                SEQUENCE #27.
MSLG$B_FORMAT         02
                                DISK TRANSFER ERROR
MSLG$B_FLAGS          81
                                SEQUENCE NUMBER RESET
                                OPERATION SUCCESSFUL
MSLG$W_EVENT          006B
                                DRIVE ERROR
                                POSITIONER ERROR (MIS-SEEK)
MSLG$Q_CNT_ID         0017F20D
                      01010000
                                UNIQUE IDENTIFIER, 00000017F20D(X)
                                MASS STORAGE CONTROLLER
                                HSC50

CONTROLLER DEPENDENT INFORMATION

ORIG ERR              1800
                                HEADER COMPARE ERROR
                                HEADER SYNC TIMEOUT
                                SUSPECTED LOW HEADER MISMATCH
ERR RECOV FLGS        0002
                                ERR LOGGED TO CONSOLE AND HOST
LVL A RETRY           01        < NOTE 1 "A" RETRY
LVL B RETRY           00        < NOTE NO "B" RETRIES
BUF DAT MEM ADR       C4BF
SRC REQ #             02
DET REQ #             02

Example 5-6 Positioner Mis-Seek MSCP Status/Event 6B
The occurrence of 6B errors caused by this phenomenon has been more pronounced on the
KDM/HSC controllers than on the KDA/KDB/UDA controllers. Since experience and engineering
evaluation have shown that the occasional occurrence of the MSCP status/event 6B, when recovered
on a single retry, is inconsequential, extra error management code has been implemented as follows:

* HSC software released after the 39x series will contain special 6B error management code that 
will look for this error signature and will not report this event characteristic of the RA90/RA92 
product. 

* The KDM70 controller with microcode at revision level 2 will also contain this enhanced error 
management code for 6B errors on RA90/RA92 disk drives. 

This phenomenon is being aggressively pursued by Digital and resolution details will be 
communicated to the field. 

5.17.2 Evaluating MSCP 6B Events 

Convert a sample of the target LBNs (20 to 30 LBNs identified in MSCP 6B events) to cylinder
and head addresses, and look for the following patterns:

• Single head but quite random cylinder addresses: consider the HDA.

• Single head but narrow band of cylinder addresses: consider mapping out suspect LBNs with
DKUTIL, or HDA replacement. Before manually forcing replacement of a perceived bad block,
make sure a current disk backup exists.

• Repeating LBNs: consider mapping out suspect LBNs with the BBR utility (DKUTIL).

• Random heads (10 of 13 heads): consider the data path, including the controller SDI module.
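The patterns above can be sketched as a simple classifier over a sample of 6B events. The following Python sketch is illustrative only. It assumes each LBN has already been converted to a cylinder and head address by other means (the conversion depends on drive geometry and is not shown), and the `narrow_band_cylinders` threshold is a hypothetical value chosen for the example, not a figure from this manual.

```python
from collections import Counter


def classify_6b_events(events, narrow_band_cylinders=50):
    """events: list of (lbn, cylinder, head) tuples for logged 6B events.

    Returns a list of suggestions matching the patterns in the text.
    narrow_band_cylinders is an assumed threshold for a "narrow band".
    """
    if not events:
        return []
    findings = []
    lbn_counts = Counter(lbn for lbn, _, _ in events)
    if any(count > 1 for count in lbn_counts.values()):
        findings.append("repeating LBNs: consider mapping out with DKUTIL")
    heads = {head for _, _, head in events}
    cylinders = [cylinder for _, cylinder, _ in events]
    if len(heads) == 1:
        if max(cylinders) - min(cylinders) <= narrow_band_cylinders:
            findings.append(
                "single head, narrow band: consider DKUTIL or HDA replacement"
            )
        else:
            findings.append("single head, random cylinders: consider the HDA")
    elif len(heads) >= 10:
        findings.append("random heads: consider data path and SDI module")
    return findings


sample = [(1000, 10, 3), (52000, 900, 3), (99000, 2000, 3)]
print(classify_6b_events(sample))
```

The output is a starting point for analysis only. The troubleshooting steps that follow still apply.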

Troubleshoot MSCP status/event 6B as follows: 

• Update the drive with the latest drive microcode version.

• If errors are only happening on one port, pursue a port path problem, including ECM, SDI
cables between drive and bulkhead, cabinet to controller cabinet, and within the controller
cabinet and the port interface module in the controller.

• Note whether more than one drive on the requester is reporting consistent 6B events. This
would more definitely suggest a port interface problem within the controller.

• If errors are clearly happening on both drive ports, pursue the problem as a drive problem
first, when the event rate exceeds the guidelines indicated above and/or customer satisfaction
dictates.

5.18 Conclusion 

The DSA architecture defines a very reliable and flexible storage subsystem. This subsystem can be 
maintained efficiently and effectively when consistent and methodical troubleshooting procedures 
are followed. 

Poorly trained or untrained Customer Services engineers are at a serious disadvantage. The cost
of supporting incorrectly identified FRUs is very high. Many FRUs are expensive to replace, and
some very expensive FRUs are not repairable. The impact to a customer can be substantial.
Impacts include:

* Necessity to back up and restore potentially large amounts of data on misdiagnosed HDA 
replacements. 

* Loss of system availability when using standalone diagnostics with controllers such as 
UDA/KDA/KDB. 

* Loss of drive availability when performing extensive subsystem diagnostics using an HSC 
controller. 
• Increased frustration and inconvenience of dealing with repeated calls.

• Loss of confidence in Digital as a quality supplier of storage systems. 

• Increased potential of data loss if improper diagnosis is made and the failure mode continues or 
gets worse. 

SERVICE GOAL 

The Customer Services engineer's number one goal in service efforts is to correctly 
diagnose a problem on the first call and replace the correct part so the customer's disk 
and data availability is minimally impacted. 

5.19 Error Codes and Descriptions

This section describes RA90/RA92 disk drive error codes. Included in each error code description is
a list of suggested replacement FRUs for repairing drive problems.

Careful analysis of both system and drive internal error logs, along with drive-generated error 
codes, should lead to problem isolation and correction. 

Error codes are listed in hex numerical order starting with error code 01 through error code FD 
(hex). The general format of the error code listings is as follows: 



01 (1) Spindle Motor Transducer Timeout (2)

Error Type: DE (3)

Error Description: The spindle was given the command to spin up by an SDI command
or from the front panel Run switch and no movement was detected by the spindle motor
transducer. See error code 13 for possible isolation help before replacing FRUs. (4)

Fault Isolation/Correction: (5)

1. ECM

2. HDA

3. Rear flex cable assembly

Where:

(1) 01 is the error code.

(2) SPINDLE MOTOR TRANSDUCER TIMEOUT is the error message.

(3) DE is the error type.

(4) Error Description: is a brief summary of the error event.

(5) Fault Isolation/Correction: is the suggested FRU replacement order for
troubleshooting.
01 Spindle Motor Transducer Timeout

Error Type: DE 

Error Description: The spindle was given the command to spin up by an SDI command 
or from the front panel Run switch, and no movement was detected by the spindle motor 
transducer. See error code 13 before replacing FRUs. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

3. Rear flex cable assembly 

02 Spinup Too Slow 

Error Type: DE 

Error Description: The spindle did not reach 1000 r/min within 20 seconds. See error code 13 
before replacing FRUs. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

3. Rear flex cable assembly 

03 Spindle Not Accelerating During Spinup 

Error Type: DE 

Error Description: The spindle did not accelerate above 1000 r/min in the allotted spinup 
timeout period. See error code 13 before replacing FRUs. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

3. Rear flex cable assembly 

04 Spinup Too Long to Lock on Speed 

Error Type: DE 

Error Description: The spindle did not reach 3600 r/min (± 18 r/min) within 30 seconds. See 
error code 13 before replacing FRUs. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

3. Rear flex cable assembly 
05 Invalid Drive Serial Number Code

Error Type: DF

Error Description: The drive serial number is out of acceptable range or an invalid
manufacturing plant code was read by the drive microcode.

Switches are set (or read) incorrectly on the rear flex cable assembly (S1/S2). This is neither a
fatal error nor a hard error. Clearing the fault allows the drive to continue operation. The drive
serial number is checked during the power-up sequence.

Table 5-6 Serial Number

Bits      Mfg   Serial Number Range   Max Binary Value
<19:18>         in Decimal            Bits <17:00>

          CX    0-262,143             111111111111111111
          CX    262,144-309,999       001011101011101111
                310,000-524,287       111111111111111111 (invalid)
          KB    0-262,143             111111111111111111
                0-262,143             111111111111111111 (invalid)
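The binary values in Table 5-6 can be reproduced from the decimal range limits. The following Python sketch assumes the serial number is held as a binary value whose low 18 bits form the <17:00> field. That reading is an inference from the table, and the function name is illustrative only.

```python
SERIAL_FIELD_BITS = 18  # width of the <17:00> field


def serial_field(serial_number):
    """Return the low 18 bits of a serial number as a binary string."""
    mask = (1 << SERIAL_FIELD_BITS) - 1
    return format(serial_number & mask, f"0{SERIAL_FIELD_BITS}b")


# Upper limits of the two valid CX ranges in Table 5-6.
print(serial_field(262_143))
print(serial_field(309_999))
```

The upper limit 309,999 is 262,144 + 47,855, and 47,855 in 18-bit binary is the 001011101011101111 value shown in the table.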



Fault Isolation/Correction: 

1. Incorrect S1/S2 bits set on rear flex cable assembly 

2. Rear flex cable assembly 

3. ECM seating problem 

4. ECM 

06 Microcode Fault 

Error Type: DF 

Error Description: A hardware/software failure caused the master processor addressing to
point to a null EEPROM area.

Fault Isolation/Correction: 

1. Reload drive microcode 

2. ECM 

07 SDI Frame Sequence Error 

Error Type: RE

Error Description: Level 1 SDI commands were detected in the wrong sequence. If the same 
drive is reporting errors from two controllers, start troubleshooting at the drive. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 
08 SDI Lvl 2 Checksum Error

Error Type: RE

Error Description: The calculated checksum did not match the checksum field sent by
the controller to the drive for SDI level 2 commands. If the same drive is reporting errors from
two controllers, start troubleshooting at the drive.

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 

09 SDI Lvl 1 Framing Error 

Error Type: RE 

Error Description: A sync pattern was detected by the drive on the SDI WRITE/COMMAND 
line, but no SDI level 1 control message transmission or single frame command was detected. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 

0A SDI Incorrect Command Opcode Parity Error

Error Type: PE 

Error Description: The wrong parity was detected on the opcode byte of a level 1 or level 2 
command. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 

0B SDI Invalid Opcode

Error Type: PE

Error Description: The decoded opcode is not a valid (level 2) SDI opcode.

Fault Isolation/Correction:

1. Controller 

2. SDI cable 

3. ECM 
0C SDI Command Length Error (LVL2)

Error Type: RE 

Error Description: This error indicates the controller caused the drive SDI input command 
buffer to overflow. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 

0D SDI Invalid Command with Drive Error

Error Type: PE 

Error Description: The controller issued an INITIATE SEEK command, an ERROR 
RECOVERY command, or a RECALIBRATE command while the drive was faulted. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

3. SDI cable 

0E SDI Lvl 1 Invalid Select Group Number

Error Type: RE 

Error Description: Indications are the controller attempted to select a nonexistent group. For 
RA90 and RA92 disk drives, group=head. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

3. SDI cable 

0F SDI Write Enable on a Write-Protected Drive

Error Type: PE 

Error Description: A drive write-protected from the OCP (front panel) was issued a WRITE 
ENABLE command through an SDI CHANGE MODE command. The OCP switch state has 
priority over any SDI CHANGE MODE commands. 

Fault Isolation/Correction: 

1. Disable Write Protect switch 

2. Controller 

3. ECM 

4. OCP 
10 SDI Command Length Error (LVL2)

Error Type: PE 

Error Description: An SDI command length error, LVL2, indicates the number of bytes 
expected did not equal the number of bytes received for an SDI level 2 command. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 

11 Microcode Cartridge Load Occurred 

Error Type: Informational Only 

Error Description: This logged event indicates that a drive microcode update successfully 
occurred. 

This new event occurred with the introduction of the Etch-F I/O-R/W module. Etch-F revision 
ECM boards are indicated by revision 1 or later in the IOP and SRV values reported with drive 
internal test T45. (There are a minimal number of Etch-E revision modules that provide this 
information.) 

Fault Isolation/Correction: 

1. Information only 

12 Spindle Speed Unsafe Error 

Error Type: DE 

Error Description: During idle loop, a spindle speed check indicated the drive was not up to 
speed at 3600 r/min (± 18 r/min). The servo processor will also detect this condition dynamically 
and have the master processor log this error as well. 

Fault Isolation/Correction: Disabling the brake circuit may aid in troubleshooting. The 
brake can be disabled by opening either pin 4 or 5 of the rear HDA connector. Use the pin 
extraction tool (P/N 29-26655-00) to avoid breaking pins. 

CAUTION 

The female pins in the HDA connector are delicate and must be handled with care.
When disabling the brake, cover loose pins with electrical tape to prevent them from
shorting.

1. Reseat HDA 

2. ECM 

3. Power supply 

4. Brake 

5. HDA 
13 Spindle Motor Control Fault

Error Type: DE 

Error Description: The motor control IC detected a condition that prevented the spindle from 
getting up to speed. 

Fault Isolation/Correction: 

1. Reseat ECM/HDA 

2. ECM 

3. HDA 

A number of checks are made to detect this fault. A failure of any of the following checks 
results in this error: 

1. If no Hall effect is seen within 700 ms after current is applied to the spindle motor. 

2. If the SSI chip on the servo module which controls spindle speed rotation is operating at 
less than 6.8 volts. 

3. If the brake circuit is activated at the same time that current is applied to the spindle. 

4. If the Hall sensor input from the spindle motor is not occurring at a 700 ms rate. 

Additionally, any open condition in the spindle circuitry, including Hall sense phase or spindle 
motor phase circuitry, causes this error to be asserted. 

Although power supply voltages cannot be adjusted, they can be measured by removing the 
small cover as shown in Figure 5-8 (power supplies bearing a serial number starting with CX 
only). On the back of the connector, the pin numbers are visible. A very small electrical probe 
is required to make connection. 



Figure 5-8 Power Supply Cover Removal (callouts: power supply, hold-down screws, quarter-turn,
power supply access cover)
Removal of this cover allows access to the power supply output voltage connector. To remove
the power supply cover, use a quarter-inch hex driver to remove the hold-down screws. Next, use
a DVM or oscilloscope to measure the points to ground (black lead) as shown in Table 5-9.



Table 5-9 Power Supply Voltage Measurements

Pin   Wire Color   Voltage Measurement        Deviation

1     Orange       +12 Vdc                    ±0.6 V
2     Black        ±12 Vdc (return)
3     Black        ±12 Vdc (return)
4     Blue         -12 Vdc                    ±0.6 V
5     Red          +5.1 Vdc                   ±0.25 V
6     Red          +5.1 Vdc                   ±0.25 V
7     Red          +5.1 Vdc                   ±0.25 V
8     Red          +5.1 Vdc                   ±0.25 V
9     Black        +5.1 Vdc (return)
10    Black        +5.1 Vdc (return)
11    Black        +5.1 Vdc (return)
12    Black        +5.1 Vdc (return)
13    Purple       -5.2 Vdc                   ±0.17 Vdc
14    Purple       -5.2 Vdc                   ±0.17 Vdc
15    Brown        -24 Vdc                    ±2.4 Vdc
16    Brown        -24 Vdc                    ±2.4 Vdc
17    Brown        -24 Vdc                    ±2.4 Vdc
18    Black        ±24 Vdc (return)
19    Black        ±24 Vdc (return)
20    Yellow       +24 Vdc                    ±2.4 Vdc
21    Yellow       +24 Vdc                    ±2.4 Vdc
22    Yellow       +24 Vdc                    ±2.4 Vdc
23    Brown        40 kHz H
24    Blue         -5.2 Vdc (sense)
25    Black        -5.2 Vdc (sense return)
26    Orange       DCOK H
27    Red          OVTEMP H
28    Blue         POCK H
29    White        ON/OFF L


In addition to these measurements, error codes 2D and FF indicate power problems. 
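The tolerances in Table 5-9 lend themselves to a simple pass/fail check of DVM readings. The following Python sketch is illustrative only. The function name and the input layout (pin number mapped to measured volts) are assumptions, and only pins with a stated deviation are checked.

```python
# (pins, nominal volts, allowed deviation in volts) from Table 5-9.
RAILS = [
    ((1,), 12.0, 0.6),
    ((4,), -12.0, 0.6),
    ((5, 6, 7, 8), 5.1, 0.25),
    ((13, 14), -5.2, 0.17),
    ((15, 16, 17), -24.0, 2.4),
    ((20, 21, 22), 24.0, 2.4),
]
TOLERANCES = {
    pin: (nominal, deviation)
    for pins, nominal, deviation in RAILS
    for pin in pins
}


def out_of_tolerance(readings):
    """readings: pin number -> measured volts (relative to ground).

    Returns the sorted list of pins whose reading is outside Table 5-9 limits.
    Pins without a stated deviation are ignored.
    """
    bad_pins = []
    for pin, measured in sorted(readings.items()):
        if pin not in TOLERANCES:
            continue
        nominal, deviation = TOLERANCES[pin]
        if abs(measured - nominal) > deviation:
            bad_pins.append(pin)
    return bad_pins


print(out_of_tolerance({1: 12.3, 5: 5.4, 13: -5.25, 20: 21.0}))
```

In this usage example, pins 5 and 20 are reported because their readings fall outside the ±0.25 V and ±2.4 V limits respectively.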

Along with the power supply measurements, a number of resistance checks can be made to the
HDA. The HDA must first be removed from the drive chassis. Exercise care when handling
the HDA so that connector pins are not damaged during measurements. DO NOT jam probes
into the connector housing from the front of the connector because it is easy to damage the pins
in these sockets. Access the pins from the rear of the connector or use the pin insert/extract
tool (P/N 29-26655-00) to remove pins from connectors for easier measurements. Refer to
Table 5-10 to locate opens in the circuits.

Table 5-10 lists pin-to-circuit connections.



Table 5-10 HDA Connector Pin Designations

Pin    Wire Color   Circuit

1      Blue         Positioner lock solenoid (-)
2      Blue         Positioner lock solenoid (+)
4      White        Brake (-)
5      White        Brake (+)
6      Green        S3
7      Violet       S2
8      Flex         Positioner actuator flex (-)
9      Orange       S1
10     Flex         Positioner actuator flex (+)
11     Brown        Hall sensor ground
12     Gray         Spindle motor coil C
13     Red          Hall sensor 5 V input
14     Blue         Spindle motor coil B
16     Black        Spindle motor coil A
Grnd   Yellow       Spindle motor lamination lead
                    exits HDA and is grounded on HDA.



Resistance measurements are checked according to Table 5-11.



Table 5-11 HDA Resistance Measurements

(-)Pin to (+)Pin    Circuit                    Measured Value

16-14               Coil A - Coil B            1.4 ohm
16-12               Coil A - Coil C            1.4 ohm
14-12               Coil B - Coil C            1.4 ohm
16 - HDA ground     Coil A - ground            20 megohm
14 - HDA ground     Coil B - ground            20 megohm
12 - HDA ground     Coil C - ground            20 megohm
9-7                 S1 - S2                    20 megohm
9-6                 S1 - S3                    20 megohm
7-6                 S2 - S3                    20 megohm
9-13                S1 - Hall 5V               20 megohm
7-13                S2 - Hall 5V               20 megohm
6-13                S3 - Hall 5V               20 megohm
9-11                S1 - Hall ground           >4.50 megohm
7-11                S2 - Hall ground           >4.30 megohm
6-11                S3 - Hall ground           >4.50 megohm
11-13               Hall ground - Hall 5V      >7 megohm
1-2                 Positioner lock solenoid   >30 ohm
8-10                Actuator coil              >4 ohms



14 Head Offset Margin Event 

Error Type: DE 

Error Description: This is not an error condition. Manufacturing sets the enable flag for the
detection of this event. If this code shows up in the field, reset the flag by taking the drive off
line and powering it down and then up.

Fault Isolation/Correction: 

1. Power drive off and back on. 

15 Head Offset Out-of-Band Error 

Error Type: DE 

Error Description: Head offset has exceeded normal head offset parameters for this drive. 
This is a serious problem. Data is in danger of being lost. Do not use the drive for further 
writes. Initiate prompt backup. Head offset errors can result from an over-temperature 
condition. Check drive airflow and ambient room temperature. If temperature appears to 
be normal, replace the HDA. 

The amount of offset necessary before this error is flagged is ±3/4 of a track. After each
offset table rebuild, the servo processor tests each head value against this threshold. If a head
exceeds offset limits, the master processor asserts ATTENTION and uses the GET STATUS
response to identify which head or heads are involved.

The drive specific bytes of the drive internal error log should indicate which head has marginal 
offsets. 

Fault Isolation/Correction: 

1. HDA 

2. ECM 

3. PCM 

16 SDI Invalid Group Select LVL2 

Error Type: PE 

Error Description: The controller attempted to select a nonexistent group. A group refers 
to a head in the RA90 and RA92 disk drives. If the drive is dual-ported and logging this error 
from both controllers, troubleshoot the drive. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 
17 SDI Port A Command/Response Timeout

Error Type: Informational Only 

Error Description: The Port A controller did not accept message response data from the 
drive. This is typically a communications event and not a drive error. 

Fault Isolation/Correction: 

1. Communications event (typically not a drive problem) 

2. Controller on Port A 

3. ECM 

18 SDI Port B Command/Response Timeout 

Error Type: Informational Only 

Error Description: The Port B controller did not accept message response data from the 
drive. This is typically a communications event and not a drive error. 

Fault Isolation/Correction: 

1. Communications event (typically not a drive problem) 

2. Controller on Port B 

3. ECM 

19 SDI Invalid Format Request 

Error Type: PE 

Error Description: The controller requested that the drive place itself in 576-byte format. 
The RA90/RA92 only accepts 512-byte format. This error can also be caused by someone trying 
to format the drive in 576-byte mode. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

1A SDI Invalid Cylinder Address 

Error Type: PE 

Error Description: The drive decoded a nonexistent cylinder address during a controller- 
initiated SEEK command. 

This error also occurs when a controller, while running diagnostics, attempts to test the DBN 
area of the disk without first setting the drive's DB bit. 

This error also occurs if an attempt is made to access cylinders beyond the DBN space if the DB 
bit is set. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 
1B Inner Guardband Error

Error Type: DE 

Error Description: The drive hardware detected servo inner guardband information instead 
of servo data information or outer guardband information. The only time the servo head is 
positioned in the inner guardband area and does not generate an error is during execution of 
diagnostics. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

NOTE 

If an actuator current error or actuator speed error is also indicated, it is probable 

that the inner guardband error is secondary. Reference the respective actuator error. 

1C Outer Guardband Error 

Error Type: DE 

Error Description: Outer guardband information was decoded when servo or inner guardband 
information was expected. The only time the servo head is positioned in the outer guardband 
area and does not generate an error is during execution of a head load operation, a recalibrate, 
or internal diagnostics. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

NOTE 

If an actuator current error or actuator speed error is also indicated, it is probable
that the outer guardband error is secondary. Reference the respective actuator error.

1D Illegal Servo Fault 

Error Type: DE 

Error Description: A servo fault was detected by the GASP array; however, when the master 
processor examined the register information, the error was invalid. 

Fault Isolation/Correction: 

1. ECM 

1E Power-Up After AC Power Loss

Error Type: Information only 

Error Description: Information event noting that the drive performed a power-up sequence 
after ac power loss. This may be the result of turning the drive power off at the breaker, or loss 
of ac power to the drive/cabinet. 

This new event occurred with the introduction of the Etch-F I/O-R/W module. Etch-F revision 
ECM boards are indicated by revision 1 or later in the IOP and SRV values reported with drive 
internal test T45. (There are a minimal number of Etch-E revision modules that provide this 
information.) 






This event is different from the event logged as a result of a power supply
over-temperature condition.

Fault Isolation/Correction: 

1. Information Only 

1F Sector Overrun Error 

Error Type: DE 

Error Description: When a sector or index pulse occurs with either WRITE GATE or READ 
GATE asserted, an overrun error is asserted. This indicates a write or read operation was 
attempted through a sector/index boundary. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

20 SDI RTCS Parity Error 

Error Type: DE 

Error Description: A bit was dropped or picked up in data transferred on the SDI Real Time
Controller State (RTCS) line.

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 

21 SDI Transfer (Pulse) Error 

Error Type: DE 

Error Description: An extra or missing pulse was detected on the SDI WRT/CMD line or the 
RTCS line. If this error occurs from both ports and/or more than one controller, troubleshoot 
the drive. If only one port is involved, troubleshoot the SDI cables or the controller. See Figure 
5-9. 
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Figure 5-9 WRT/CMD Data Format 
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On the WRT/CMD and RTCS lines, a positive transition at the leading edge of a bit cell 
indicates a one; a negative transition indicates a zero. If the next bit cell contains the same 
data (a one followed by a one or a zero followed by a zero), the line switches polarity in the 
middle of the bit cell. 

The error is detected by the TSID gate array and is passed to the SDI gate array as a PLS 
ERR error. A pulse error should only be reported when the drive is executing a data transfer 
operation. If a pulse error occurs during a TRANSFER command, PLS ERR will set bit of 
Fault Register 3 of the SDI gate array. 
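
The bit-cell rule described above can be illustrated with a short Python sketch. This is not drive firmware or a model of the TSID gate array: the two-samples-per-cell representation, the function names, and the choice to hold the level through the final cell are assumptions made only for illustration. Under the stated rule every bit-cell boundary carries a transition, so a boundary without one corresponds to a missing pulse.

```python
def encode(bits):
    """Return two line levels (first half, second half) for each bit cell.

    A one is a positive transition at the leading edge (line goes high);
    a zero is a negative transition (line goes low). If the next bit is the
    same, the line switches polarity mid-cell so the same edge can repeat.
    """
    levels = []
    for i, bit in enumerate(bits):
        same_next = i + 1 < len(bits) and bits[i + 1] == bit
        levels.append(bit)                              # level after the leading edge
        levels.append(1 - bit if same_next else bit)    # mid-cell flip only on a repeat
    return levels


def decode(levels):
    """The level in the first half of each cell is the data bit."""
    return levels[0::2]


def missing_transitions(levels):
    """Indexes of bit cells whose leading edge shows no transition."""
    return [(i + 1) // 2
            for i in range(1, len(levels) - 1, 2)
            if levels[i] == levels[i + 1]]


if __name__ == "__main__":
    line = encode([1, 1, 0, 0, 1])
    print(line, decode(line), missing_transitions(line))
```

Forcing one half-cell sample to the wrong level in the encoded list makes `missing_transitions` report the affected cell, which is the kind of condition the text describes as a pulse error.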

Fault Isolation/Correction: 

1. ECM 

2. SDI cables 

3. Controller 

4. Power supply 

5. Spindle ground brush 

22 Electronic Control Module Over-Temperature Error 

Error Type: DE 

Error Description: An over-temperature condition exists in the drive. Drive over-temperature 
conditions result from high room temperature or a dirty air vent inhibiting airflow through the 
drive. Additionally, a bad blower motor could cause the internal temperature of the drive to 
increase, but a 2D error is more likely in this case. This over-temperature condition happens 
when the detector senses 43°C (110°F). 

Fault Isolation/Correction: 

1. Ambient air temperature is too high 

2. Cabinet door air vent needs cleaning 

3. Blower assembly 

4. ECM 

24 Loss of Fine Track During Data Transfer 

Error Type: DE 

Error Description: A loss of fine track was detected when a read or write operation was ready 
to begin, but not actually started. This error code is not implemented in microcode revision 7 
and later. 

Refer to servo event 9A. 

Fault Isolation/Correction: 

1. Install RA90X-O001 FCO 

25 Servo Fault Error 

Error Type: DE 

Error Description: A servo error was detected but no condition was found that would cause 
the error condition. The master processor, while in its idle loop, was scanning the servo GASP 
gate array and discovered error bit(s) set. Valid conditions include: 

• Actuator fault

• PLO error

• Actuator over current error

• Actuator over speed error

• Track counter error

• Off track error

• Guardband error

• Heat sink 1 error: over-temperature

• Heat sink 2 error: over-temperature

Fault Isolation/Correction: 

1. ECM 

26 Spindle Speed Error (Servo Processor) 

Error Type: DE 

Error Description: Spindle is not within ± 0.5% of 3600 r/min. The servo processor monitors 
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error. Upon detection of the loss of PLO, the master processor examines the servo processor 
status to determine if it has valid servo-detected error information. If it does, this error is 
asserted. 
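
As a worked example of the tolerance stated above, the following Python sketch computes the speed window implied by ± 0.5% of 3600 r/min. The function names are hypothetical, and treating the limits as inclusive is an assumption; the drive's actual measurement method is not described here.

```python
NOMINAL_RPM = 3600.0   # nominal spindle speed from the error description
TOLERANCE = 0.005      # plus or minus 0.5 percent


def speed_window(nominal=NOMINAL_RPM, tolerance=TOLERANCE):
    """Return the (low, high) speed limits in r/min."""
    delta = nominal * tolerance
    return nominal - delta, nominal + delta


def spindle_in_tolerance(rpm):
    """True if rpm lies inside the window (limits assumed inclusive)."""
    low, high = speed_window()
    return low <= rpm <= high


if __name__ == "__main__":
    print(speed_window())
    print(spindle_in_tolerance(3590.0), spindle_in_tolerance(3570.0))
```

The window works out to 18 r/min either side of nominal, that is, 3582 to 3618 r/min.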

Fault Isolation/Correction: 

1. ECM 

2. Brake 

3. HDA 

27 Servo Over-Temperature Error at S1

Error Type: DE 

Error Description: The thermal sensor (S1) on the servo module detected an
over-temperature condition. This results in the master processor spinning the disks down and
setting this error condition. If the over-temperature clears, the controller can initialize the 
drive and try to spin it back up. 

Fault Isolation/Correction: 

1. Ambient air temperature too high 

2. Cabinet door air vent needs cleaning 

3. Blower assembly 

4. ECM 

28 Servo Over-Temperature Error at S2 

Error Type: DE 

Error Description: The thermal sensor (S2) on the ECM detected an over-temperature 
condition. This results in the master processor spinning the disks down and setting this error 
condition. If the over-temperature clears, the controller can initialize the drive and try to spin 
it back up. 

Fault Isolation/Correction: 

1. Ambient air temperature too high 

2. Cabinet door air vent needs cleaning 

3. Blower assembly 

4. ECM 
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29 SDI Invalid Error Recovery Level Specified 

Error Type: PE 

Error Description: The controller issued an SDI ERROR RECOVERY command with an 
illegal recovery level. The RA90/RA92 supports 14 error recovery mechanisms. This value
is passed to the controller during a GET COMMON CHARACTERISTICS command. The 
controller in this case asked for a level greater than 14. 

Not all controllers report the error recovery levels in the same manner. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

2A SDI Invalid Subunit Specified 

Error Type: PE 

Error Description: The controller attempted a GET STATUS command to a subunit address 
other than zero. (The RA90/RA92 is a single unit drive with a subunit address of zero.) 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

2B SDI Invalid Diagnose Memory Region Location 

Error Type: PE 

Error Description: The controller or the operator attempted to execute a nonexistent internal 
drive test or internal diagnostics while the drive was on line to the controller. 

Fault Isolation/Correction: 

1. Use valid diagnostic 

2. Controller 

3. ECM 

2C SDI Spindle Not Ready with Seek/Recalibration Command

Error Type: PE 

Error Description: A RECALIBRATE or SEEK command was issued to a spun-down disk 
drive. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 
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2D Power Supply Over-Temperature 

Error Type: DE 

Error Description: A critical over-temperature condition exists in the power supply. This 
condition is detected by the master processor through the OVER TEMP L signal. Within 15 ms
of detection, the dc voltages are removed in an orderly fashion. The error is stored in EEPROM 
and can be read when power is restored to the drive after the over-temperature condition is 
corrected or the power supply cools down sufficiently to allow power to be reapplied. 

Fault Isolation/Correction: 

1. Ambient air temperature too high 

2. Blower assembly 

3. Power supply 

4. ECM 

5. Rear flex cable assembly 

2E SDI Spinup Inhibited by Controller Flags 

Error Type: PE 

Error Description: The drive cannot be spun up from the OCP while the drive is in the 
AVAILABLE or ONLINE state to the controller. 

NOTE

If the Run switch is selected prior to the Fault switch, a 2E LED code will be indicated.

Fault Isolation/Correction: 

1. Check Run switch 

2. ECM 

2F SDI RUN Command with Run Switch in Stop Position 

Error Type: PE 

Error Description: An SDI RUN command was issued to the drive when the OCP Run switch 
was in a logical stop state. 

Fault Isolation/Correction: 

1. Check OCP switch state 

2. Controller 

3. ECM 

30 Write Current and No Write Gate 

Error Type: DE 

Error Description: Current was detected at the read/write heads and WRITE GATE was not 
asserted. The PCM provides the current source for the write chips in the HDA. Drive firmware 
tests for this condition during diagnostics. 
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Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

31 Read Gate and Write Gate Both Asserted 

Error Type: DE 

Error Description: The SDI gate array detected that READ GATE and WRITE GATE were
asserted at the same time. 

Fault Isolation/Correction: 

1. ECM 

2. Controller 

32 Read or Write While Faulted 

Error Type: DE 

Error Description: A READ or WRITE command was issued to a drive that has a fault 
condition. 

Fault Isolation/Correction: 

1. Check error log for fault condition 

2. Controller 

3. ECM 

33 Attempt to Write Through Bursts 

Error Type: DE 

Error Description: An attempt was made to assert WRITE GATE while the read/write heads 
were positioned over embedded servo burst information. 

Fault Isolation/Correction: 

1. ECM 

2. Controller 

3. HDA 

34 ENDEC Encoder Error 

Error Type: DE 

Error Description: Data to be written to media has been improperly 2/3 encoded. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 
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35 Write and Write Unsafe 

Error Type: DE 

Error Description: A problem in the write data path prevented the drive from correctly 
writing data to the disk surface. One or more of the following conditions cause this error:

• No write data transitions

• No write current

• No SSI 283 (head select chip) selected

• SSI 283 stuck in read mode

The unsafe conditions are wire-ORed together and are detected on the PCM.

Fault Isolation/Correction:

1. PCM 

2. HDA 

3. ECM 

36 Write and Servo Uncalibrated 

Error Type: DE 

Error Description: The firmware routines used to calibrate the read/write heads and the 
servo system failed to complete successfully. The subsequent write was attempted with the
servo uncalibrated. 

Fault Isolation/Correction: 

1. PCM 

2. ECM 

3. HDA 

37 Write Gate and No Write Current 

Error Type: DE

Error Description: WRITE GATE was asserted but no write current was detected at the
read/write heads. The PCM sources the current when WRITE GATE is asserted. 

Fault Isolation/Correction: 

1. PCM 

2. ECM 

3. HDA 

38 Read Gate and Multiple Head Chips Selected 

Error Type: DE 

Error Description: During a read operation, the master processor determined that more than
one head and/or more than one SSI 283 chip was selected. 
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Fault Isolation/Correction: 

1. PCM 

2. ECM 

3. HDA 

39 Write Gate and Off Track 

Error Type: DE 

Error Description: A loss of fine track was detected when WRITE GATE was asserted. This 
error code is not implemented in microcode revisions 7 and later. This error code is used 
exclusively with the dedicated-only servo system found on earlier drives. Refer to error code 9B. 

Fault Isolation/Correction: 

1. Install RA90X-O001 unless superseded by a later FCO. 

3A Write Gate and Write-Protected

Error Type: WE

Error Description: A write-protected drive detected the assertion of WRITE GATE.

Fault Isolation/Correction:

1. Controller 

2. ECM 

3B Hard INIT Occurred to Drive 

Error Type: DE 

Error Description: This is not typically an error condition. It is a record of initializations,
both those started by the controller through the RTCS logical signal INIT and those started
by the drive. An initialization stops mechanical movement; the drive then performs a power-up
initialization and reloads the servo processor code.

Examine previous error conditions. 

With drive microcode revisions 10 or earlier, if the drive performs a hard initialization on its 
own (for example, when new drive microcode has just been reloaded), this error entry will be 
recorded into the EEPROM. 

Microcode revisions later than 10 give a new indication of microcode reload. Refer to drive LED 
code 11. 

Fault Isolation/Correction: 

1. Look for previous errors 

2. ECM 

3. Controller 

3D HDA Read/Write Interlock Broken

Error Type: DE

Error Description: The cable between the PCM and the ECM is disconnected or broken.




Fault Isolation/Correction: 

1. Disconnected ECM-to-PCM cable 

2. Bad ECM-to-PCM cable 

3. PCM 

4. ECM 

3E OCP Interlock Broken 

Error Type: DE 

Error Description: The operator control panel was removed with dc voltages still applied to
the drive. 

Fault Isolation/Correction: 

1. OCP flex circuit connectors 

2. Bezel/blower flex circuit/servo module connectors 

3. Servo module/ECM connectors 

4. OCP 

5. ECM 

40 SDI Invalid Read Memory Region Error 

Error Type: PE 

Error Description: The controller issued an SDI level 2 READ MEMORY REGION command 
to an invalid region of drive read memory. 

Fault Isolation/Correction: 

1. Operator attempted to write or read a nonexistent or protected memory location. 

2. Controller 

3. ECM 

42 Drive Not On Line/SEEK Command Issued

Error Type: PE 

Error Description: The controller issued an SDI level 2 INITIATE SEEK command and the 
drive was not on line to the controller. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable 

3. ECM 
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43 TCR and Not Read/Write Ready Fault 

Error Type: RE 

Error Description: The SDI gate array has decoded a data transfer command from the 
controller, but the drive is not ready to read/write; or the drive detected a loss of READ/WRITE 
READY during a data transfer. 

Fault Isolation/Correction: 

1. Controller 

2. SDI cable (poor SDI connection) 

3. ECM 

44 Format Command and Format Not Enabled 

Error Type: RE 

Error Description: A FORMAT ON SECTOR OR INDEX command or a SELECT TRACK
AND FORMAT ON INDEX command was decoded by the drive without the format bit (FO)
being set in the drive.

Fault Isolation/Correction: 

1. Controller 

2. ECM 

45 Read Gate and Off Track Both Asserted 

Error Type: DE 

Error Description: A loss of fine track was detected when read gate was asserted. This error 
code is not implemented in microcode revisions 7 and later. This error code is used exclusively 
with the dedicated-only servo system found on earlier drives. Refer to error code 9B. 

Fault Isolation/Correction: 

1. Install RA90X-O001 unless superseded by a later FCO. 

46 Invalid Hardware Fault 

Error Type: DE 

Error Description: A failure was detected for unused fault inputs to the SDI gate array. 

Fault Isolation/Correction: 

1. ECM

47 Invalid Disconnect Command/TT Bit Error 

Error Type: PE 

Error Description: An SDI DISCONNECT command was issued to the drive and the TT 
modifier bit was in an incorrect state. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 
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48 Invalid Write Memory Byte Counter/Offset Error 

Error Type: PE 

Error Description: The drive detected an incorrect number of data bytes to be written in 
drive memory; or the directed offset into the memory region was incorrect. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

49 Invalid Command During TOPOLOGY Command 

Error Type: PE 

Error Description: During the execution of an SDI level 2 TOPOLOGY command, the drive 
received an illegal SDI level 2 command from another controller. 

Fault Isolation/Correction: 

1. Controller 

2. ECM 

4A Drive Disabled by Controller (DD Bit Set) 

Error Type: Informational Only 

Error Description: The controller issued an SDI level 2 CHANGE MODE command to a drive
with its DD bit asserted. When the controller asserts the DD bit, it disables the drive from 
further I/O activity. 

Fault Isolation/Correction: 

1. Controller (controller error routine determined the drive should be taken out of service) 

2. ECM 

4B Index Error 

Error Type: DE 

Error Description: No index pulse was detected for one revolution of the disk. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

4C SDI Invalid Write Memory Region Error 

Error Type: PE 

Error Description: An SDI level 2 command was issued to a drive-defined invalid memory
region. 

Fault Isolation/Correction: 

1. Operator (attempting to write a nonexistent or protected memory location in drive) 

2. Controller 

3. ECM 
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4D Write Gate and Bad Embedded Servo Information 

Error Type: DE 

Error Description: The servo processor discovered incorrect embedded servo information 
while WRITE GATE was asserted. 

Fault Isolation/Correction: 

1. HDA 

2. PCM 

3. ECM 

4F Invalid Select Group (Level 1 Command) - Not Read/Write Ready 

Error Type: RE

Error Description: The controller issued a level 1 SELECT GROUP command to a drive when 
the drive was not read/write ready. 

Fault Isolation/Correction: 

1. Check OCP for drive state 

2. Controller 

3. ECM 

50 Servo Data Bus Failure 

Error Type: DF 

Error Description: A communication path to the GASP array failed during resident diagnostic 
testing. 

Fault Isolation/Correction: 

1. ECM 

51 Sector/Byte Counter Error 

Error Type: DF 

Error Description: A resident diagnostic failure occurred during testing of the sector counter 
register or byte counter register. 

Fault Isolation/Correction: 

1. ECM 

52 Servo RAM Test Failure (Low Byte of Address) 

Error Type: DF 

Error Description: At power-up, the drive-resident diagnostics failed during testing of RAM 
located on the servo portion of the ECM. 

Fault Isolation/Correction: 

1. ECM 
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53 Servo Processor Offset Error 

Error Type: DE 

Error Description: The servo system failed to offset the heads during error recovery. 

Fault Isolation/Correction: 

1. ECM 

54 Head Select Register Loopback Error 

Error Type: DF 

Error Description: A drive-resident diagnostic detected a failure in the head select register. 
The head select register is inside the SDI gate array. 

Fault Isolation/Correction: 

1. ECM 

55 DSP Sanity Timeout After Load 

Error Type: DE 

Error Description: The servo processor microcode was reloaded from the EEPROM on the 
I/O-R/W module because of a fault condition. After the microcode was reloaded in servo RAM, 
the master processor initiated a servo sanity test. The sanity test timed out, indicating a 
problem with the servo processor. 

Fault Isolation/Correction: 

1. ECM 

56 Servo RAM Test Failure (High Byte of Address) 

Error Type: DF 

Error Description: A drive-resident diagnostic failed when testing RAM that resides on the 
servo module. 

Fault Isolation/Correction: 

1. ECM 

57 Master Processor Timer Failure 

Error Type: DF 

Error Description: A drive-resident diagnostic failed when testing the time count register or 
output compare register. Both are located internal to the master processor. 

Fault Isolation/Correction: 

1. ECM 

58 Dedicated Head Gain Calibration Error 

Error Type: DE 

Error Description: The servo processor timed out while attempting to measure and 
compensate for the gain of the dedicated servo head. 
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Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

59 Embedded Servo Offset Calibration Error 

Error Type: DE

Error Description: The servo processor timed out during a calibrate of the read/write head 
offsets. This calibration occurs during all head loads and periodically thereafter. 

Fault Isolation/Correction: 

1. HDA (most probable, especially if only one head is involved) 

2. ECM (10 of 13 heads affected) 

3. PCM 

5A Embedded Head Gain Calibration 

Error Type: DE 

Error Description: The servo processor timed out while attempting to calibrate the head gain 
relative to the read/write head embedded burst information. The drive calculates this gain for 
each of the read/write heads. 

Fault Isolation/Correction: 

1. ECM (if most heads show problem) 

2. PCM (if most heads show problem) 

3. HDA (most probable, especially if only 1 head is involved) 

5B Bias Calibration Error 

Error Type: DE

Error Description: The servo processor timed out during a bias force adjustment to the 
actuator. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

5C Incorrect Diagnostic Index or Sector Pulse 

Error Type: DF 

Error Description: In testing the sector and byte counters, the master processor detected that 
the sector counter was not working properly. 

Fault Isolation/Correction: 

1. ECM 
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60 Read/Write Head Select Failure 

Error Type: DE 

Error Description: A failure occurred when attempting to select a group (head). When a
group selection is requested, logic and firmware in the drive verify that the correct SSI 283
chip and head in the HDA have been selected. This verification takes place during functional 
operation and in diagnostic mode. 

Fault Isolation/Correction: 

1. PCM 

2. PCM-to-ECM cable 

3. ECM 

61 Diagnostic Index Sync Timeout Error 

Error Type: DF 

Error Description: A drive-resident diagnostic failed to detect an index pulse. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

62 Read Test Overall Read Failure (Three or More Bad Heads) 

Error Type: DF 

Error Description: During the execution of a resident diagnostic read-only test or write/read
test, data read by three or more heads from the diagnostic cylinders did not compare to the
originally written patterns.

The RA90 drive has two diagnostic cylinders (2659 and 2660) located in the inner guardband 
area of the media. Only the drive can access these two cylinders; they cannot be accessed by the 
controller. These are not the same cylinders used by the controller to execute controller-based 
diagnostics (DBN space). Refer to drive-resident diagnostic 17. 

Fault Isolation/Correction:

1. Reformat the read-only cylinder by running drive-resident diagnostic 17 

2. PCM 

3. ECM 

4. HDA 

63 Read Test Partial Failure (One or Two Bad Heads) 

Error Type: DF 

Error Description: During the execution of a resident diagnostic read-only test or write/read
test, data read by one or two heads from the diagnostic cylinders did not compare to the
originally written patterns. Refer to error code 62.
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Fault Isolation/Correction: 

1. Reformat the read-only cylinder by running drive-resident diagnostic 17 

2. PCM 

3. ECM 

4. HDA 
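
The split between error codes 62 and 63 depends only on how many heads returned bad data. A minimal Python sketch of that rule follows; the function name and the list-of-head-numbers input are hypothetical and are not part of the drive-resident diagnostics.

```python
def read_test_error_code(bad_heads):
    """Map the set of failing heads to the read test error code.

    bad_heads: iterable of head numbers whose data did not compare.
    Returns "62" (three or more bad heads), "63" (one or two), or None.
    """
    count = len(set(bad_heads))   # count each head once
    if count == 0:
        return None               # no miscompares, no error
    return "62" if count >= 3 else "63"


if __name__ == "__main__":
    print(read_test_error_code([0, 4, 9]))
    print(read_test_error_code([4]))
```

A failure confined to one or two heads points more toward the HDA, while a failure across many heads makes the shared read path (PCM, ECM) a more likely suspect; the sketch only classifies the code and does not choose a FRU.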

64 Cannot Clear IID Error Bits 

Error Type: DF 

Error Description: Error detection logic internal to the IID gate array cannot be cleared. 

Fault Isolation/Correction: 

1. ECM 

65 Diagnostic Index or Sector Not Detected 

Error Type: DF 

Error Description: No index pulse was detected during the execution of resident diagnostics 
that read or write media. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

66 Read Test Servo Failure 

Error Type: DF 

Error Description: The drive internal diagnostic read or write/read test failed because of an 
off-track condition. 

Fault Isolation/Correction: 

1. PCM 

2. ECM 

3. HDA 

67 Cannot Execute Write Test (Read-Only Test Failed or Not Run First) 

Error Type: DF 

Error Description: This indicates an operator error, not a drive problem. 

Service personnel must run the read-only test before attempting to run the write test. 
Additionally, the read test must be successful before the write/read diagnostic is executed. 

Fault Isolation/Correction: 

1. Service personnel attempted to execute the write/read diagnostic without first executing the 
read-only diagnostic. 

2. The read-only diagnostic failed and the write/read diagnostic was attempted anyway. 

3. ECM 
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68 This Diagnostic Cannot Execute Without Software Jumper 

Error Type: DF 

Error Description: A diagnostic or utility was attempted without having first selected the 
Run/Stop switch. The Run/Stop switch must be selected within 1.5 seconds after initiating
certain tests with the Write Protect switch. 

Fault Isolation/Correction: 

1. Procedural error 

2. ECM 

3. OCP 

69 Unable to Force Compare Error 

Error Type: DF 

Error Description: The drive failed to force a data compare error during a read-only 
diagnostic. 

Fault Isolation/Correction: 

1. ECM 

6A Unable to Force No-Sync Error

Error Type: DF

Error Description: The diagnostic firmware was unable to force a no-sync error.

Fault Isolation/Correction:

1. ECM 

6B R/W Write/Read Test Overall Failure (Three or More Bad Heads) 

Error Type: DF 

Error Description: The data read from three or more heads during execution of resident 
diagnostics was incorrect. The heads are positioned at the drive-reserved diagnostic cylinders
during these tests. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

6C R/W Write/Read Test Partial Failure (One or Two Bad Heads)

Error Type: DF 

Error Description: The data read from one or two heads was incorrect. The heads were 
positioned at the drive-reserved diagnostic cylinders.

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 
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6D Unable to Force Read Gate and Write Gate Together 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force the simultaneous assertion 
of READ GATE and WRITE GATE. 

Fault Isolation/Correction: 

1. ECM 

6E Unable to Force Write Gate and Write Protect Error

Error Type: DF

Error Description: A write-protected drive has WRITE GATE asserted but no error was
detected.

Fault Isolation/Correction: 

1. ECM 

6F Diagnostic Write Attempted While Write-Protected 

Error Type: DF 

Error Description: Either the Write/Read Diagnostic or the Diagnostic Track Format Utility 
was attempted on a write-protected drive. 

Fault Isolation/Correction: 

1. Drive write-protected from the OCP 

2. Drive write-protected by the controller 

3. ECM 

70 Servo Processor Spinup Timeout

Error Type: DE 

Error Description: The master processor timed out after issuing a SPINUP command to the 
servo processor. 

Fault Isolation/Correction: 

1. ECM 

71 Recalibrate Timeout Error 

Error Type: DE 

Error Description: The master processor timed out during a RECALIBRATE command issued 
to the servo processor. 

Fault Isolation/Correction: 

1. ECM 
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72 Servo Processor Seek Timeout 

Error Type: DE

Error Description: The servo processor timed out the execution of a SEEK command. This is
a gross seek error in that the servo subsystem never sensed that it got even within a cylinder of
the desired cylinder within 100 ms.

Fault Isolation/Correction: 

1. ECM 

2. HDA 

73 Servo Processor Head Switch Timeout 

Error Type: DE 

Error Description: The master processor timed out before the servo processor responded to a 
head switch status request. 

Fault Isolation/Correction: 

1. ECM 

74 Offset Timeout Error 

Error Type: DE 

Error Description: The master processor timed out during an offset check or OFFSET 
command to the servo processor. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

75 Servo Processor Unload Timeout 

Error Type: DE 

Error Description: The master processor timed out after issuing an UNLOAD (head) 
command to the servo processor. 

Fault Isolation/Correction: 

1. ECM 

76 Servo Processor Sanity Timeout 

Error Type: DE 

Error Description: The master processor timed out while waiting for a response from the
servo processor after issuing a SANITY CHECK command. 

Fault Isolation/Correction: 

1. ECM 
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77 Head Load Timeout Error 

Error Type: DE 

Error Description: The master processor timed out waiting for a response from the servo 
processor after issuing a HEAD LOAD command. 

Fault Isolation/Correction: 

1. ECM 

78 Servo Processor Bias Force Calibration Timeout 

Error Type: DE 

Error Description: The master processor issued a BIAS CALIBRATION command (diagnostic 
opcode) to the servo processor. The master processor timed out while waiting for a servo 
processor response. 

Fault Isolation/Correction: 

1. ECM 

79 Dedicated Servo Calibration Timeout Error 

Error Type: DE 

Error Description: The master processor timed out waiting for the servo processor to respond 
to a DEDICATED SERVO CALIBRATION command issued as part of a diagnostic opcode. 

Fault Isolation/Correction: 

1. ECM 

7A Embedded Offset/Gain Calibration Timeout 

Error Type: DE 

Error Description: The master processor timed out while waiting for the servo processor 
to respond to an EMBEDDED OFFSET CALIBRATION or EMBEDDED HEAD GAIN 
CALIBRATION command issued by a diagnostic opcode. 

Fault Isolation/Correction: 

1. ECM 

7B Invalid Test While Spindle Running 

Error Type: DF 

Error Description: The drive was spun up and the operator selected a diagnostic that can 
only be run when the drive is spun down. (Certain diagnostics can only be executed on a
spun-down drive.) Refer to Chapter 4 for a complete listing of diagnostics and execution 
requirements. 

Fault Isolation/Correction: 

1. Spin down drive to run selected tests 

2. ECM 
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7C Gray Code Match Error After Settling 

Error Type: DE

Error Description: Head settling on a track normally occurs following a SEEK command. In
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this case, the servo was settling within 1/4 track of the desired track (but fine track had not 
been asserted) when suddenly the servo gray coded information indicated that movement of 
>1 cylinder has taken place away from the desired target cylinder. Such an occurrence may 
be related to an intermittent open of the coil actuator circuitry or transient spike in voltage 
establishing the holding current for the positioner. 

Fault Isolation/Correction: 

1. ECM

2. HDA

7D Embedded Interrupt Timeout 

Error Type: DE 

Error Description: The servo processor failed to detect a BURST PROTECT transition 
(asserted to de-asserted state) as generated from the master processor (ECM). 

Fault Isolation/Correction:

1. ECM

7E Fine Track Lost After Settling

Error Type: DE 

Error Description: The actuator initially settled on track but has now moved off track and 
loss of fine track has been declared by the servo subsystem. This condition has persisted for 2
seconds. 

Examine head and/or cylinder correlation when considering this error. This information should
be derivable from the host error log or by doing a complete dump of the drive internal error log 
with a controller. 

Other contributors to this condition might be sustained vibration to the drive unit, HDA 
runout condition, or an HDA mechanical resonance problem.

Fault Isolation/Correction: 

1. ECM (if totally random cylinders and heads) 

2. HDA (first choice if same cylinder(s))

3. HDA (first choice if same head(s)) 

7F Servo Settling Timer Expired 

Error Type: DE 

Error Description: The actuator was not able to settle on track within the allotted settling 
timeout period. The servo system was able to relocate to within 1/4 track of the desired 
track/cylinder, however, it could not meet the fine track threshold stability criteria within 
the time allotted (1.8 seconds). 

Examine head and/or cylinder correlation when considering this error.
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Fault Isolation/Correction: 

1. ECM (if totally random cylinders and heads) 

2. HDA (first choice if same cylinder(s)) 

3. HDA (first choice if same head(s)) 
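
The head/cylinder correlation called for under error codes 7E and 7F can be sketched in a few lines of Python. This is an illustration only: the function name, the (cylinder, head) tuple input, and the 80 percent dominance threshold are assumptions and do not come from this manual or from any Digital tool.

```python
from collections import Counter

def first_fru_candidate(events, dominance=0.8):
    """Suggest the first FRU to consider for 7E/7F errors.

    events    -- list of (cylinder, head) tuples taken from the error log
    dominance -- hypothetical fraction of events that must share one
                 cylinder or one head before it counts as "the same"
    """
    if not events:
        raise ValueError("no events to correlate")
    total = len(events)
    # Count of the most frequently failing cylinder and head.
    top_cyl = Counter(cyl for cyl, _ in events).most_common(1)[0][1]
    top_head = Counter(head for _, head in events).most_common(1)[0][1]
    if top_cyl / total >= dominance or top_head / total >= dominance:
        return "HDA"   # same cylinder(s) or same head(s)
    return "ECM"       # totally random cylinders and heads
```

A single logged event always correlates with itself, so a sketch like this is only meaningful once several events have accumulated.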

80 Master Processor ROM Consistency Code Mismatch

Error Type: DF 

Error Description: The master processor microcode is inconsistent with the microcode stored
in EPROM. 

Fault Isolation/Correction: 

1. Reload microcode 

2. ECM 

81 Servo Processor Settle State Timeout 

Error Type: DE 

Error Description: The actuator was not able to settle on track within the allotted settling 
timeout period. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

82 Servo Processor Coarse Velocity State Timeout 

Error Type: DE 

Error Description: The servo processor timed out when commanded to move the actuator 256 
or more cylinders. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

83 Servo Processor Fine Velocity State Timeout 

Error Type: DE

Error Description: The servo processor timed out when commanded to move the actuator fewer
than 256 cylinders.

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 
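
The split between error codes 82 and 83 depends only on the commanded seek length. A minimal sketch of that rule follows; the function and constant names are hypothetical, and only the 256-cylinder boundary comes from the descriptions above.

```python
COARSE_SEEK_THRESHOLD = 256  # cylinders, per the descriptions of codes 82 and 83

def seek_timeout_code(seek_length_cylinders):
    """Return the timeout code that applies to a seek of the given length."""
    if seek_length_cylinders < 0:
        raise ValueError("seek length must be non-negative")
    # 256 or more cylinders: coarse velocity state (82).
    # Fewer than 256 cylinders: fine velocity state (83).
    if seek_length_cylinders >= COARSE_SEEK_THRESHOLD:
        return "82"
    return "83"
```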

84 Servo Processor Seek Direction Error

Error Type: DE 

Error Description: Servo processor actuator (positioner) and dedicated servo information 
indicated that the seek direction was wrong. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

85 Master Processor RAM Test Failure 

Error Type: DF 

Error Description: The drive-resident diagnostics detected bad RAM internal to the master 
processor. 

Fault Isolation/Correction: 

1. ECM 

86 Static RAM Failure 

Error Type: DF 

Error Description: Drive-resident diagnostics detected bad RAM external to the master 
processor. 

Fault Isolation/Correction: 

1. ECM 

87 Master Processor ROM Checksum Failure 

Error Type: DF 

Error Description: Drive-resident diagnostics detected bad ROM internal to the master 
processor. 

Fault Isolation/Correction:

1. ECM 

88 Master Processor EEPROM Write Violation Error

Error Type: DE 

Error Description: EEPROM was addressed and written to while in read-only mode.

Fault Isolation/Correction: 

1. ECM 

89 Seek Speed Out of Range 

Error Type: DE 

Error Description: While monitoring the speed of the actuator, the servo processor 
determined that seek velocity is beyond prescribed speed. 

Fault Isolation/Correction:

1. ECM 

2. Power supply 

3. HDA 

8A Servo Processor Inside of Destination Track During Settle State 

Error Type: DE 

Error Description: Servo processor has determined that the positioner has placed heads 
inside of the destination track during settle state. 

Fault Isolation/Correction: 

1. ECM 

8B Gray Code Error After Settling With Fine Track 

Error Type: DE 

Error Description: Head settling on a track normally occurs following a SEEK command. 
A gray code comparison is made to ensure the heads have settled on the requested track. 
In this case, the servo was settling within 1/4 track of the desired track and fine track had 
been asserted when suddenly the servo gray coded information indicated that movement of 
>1 cylinder has taken place away from the desired target cylinder. Such an occurrence may 
be related to a significant amount of vibration in the vertical axis of the drive, or electrical 
transients from the positioner control voltage and holding current circuitry. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

8C Uncalibrated and PLO Error

Error Type: DE

Error Description: A PLO error occurred and the head offsets were uncalibrated.

Fault Isolation/Correction:

1. ECM 

2. PCM 

3. HDA 

8D Polarity Error on Velocity Command During a Multi-Track Seek 

Error Type: DE 

Error Description: The polarity indication bit in a velocity command profile was clear (zero) 
during a multi-track seek. This bit should have been set. (This is one of the setup functions the 
servo processor checks before it executes the digital servo seek profiles.) 

Fault Isolation/Correction: 

1. ECM 

8E Master Processor ROM/EEPROM Consistency Code Mismatch

Error Type: DF

Error Description: Master processor microcode is incompatible with EEPROM microcode.

Fault Isolation/Correction:

1. Reload microcode 

2. ECM 

8F EEPROM Checksum Failure 

Error Type: DF

Error Description: Drive-resident diagnostics detected bad EEPROM external to the master 
processor. The calculated checksum did not match the stored checksum. 

Fault Isolation/Correction: 

1. ECM 

90 Unable to Force Index Error 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force and/or detect an index 
error. 

Fault Isolation/Correction: 

1. ECM 

91 No Interrupt Detected During R/W Force Fault 

Error Type: DF 

Error Description: No interrupt to the master processor was detected by the drive during the 
read/write force fault diagnostic. 

Fault Isolation/Correction: 

1. ECM 

92 Inner Guardband Without a Servo Fault Set 

Error Type: DF 

Error Description: The actuator was positioned in the inner guardband area and the inner 
guardband flag was set; however, a servo fault condition was not detected. 

Fault Isolation/Correction: 

1. ECM 

93 Inner Guardband/Servo Fault: No Interrupt Detected 

Error Type: DF 

Error Description: The actuator was positioned at a cylinder in the inner guardband area, 
the inner guardband flag was set, and a servo fault error was detected, but the master processor 
was not interrupted. 

Fault Isolation/Correction: 

1. ECM 

94 SDI Loopback Test Failure on Both Ports

Error Type: DF 

Error Description: Drive-resident diagnostics detected an SDI gate array or TSID gate array 
failure involving both SDI ports A and B logic. If the drive internal test 09 fails, the failure 
could be in the hardware external to the SDI/TSID gate array as well. During internal T09, the 
testing expects SDI loopback connectors to be attached to the ECM or at the cab bulkhead. 

Fault Isolation/Correction: 

If test number 08 fails: 

1. ECM 

If test number 09 fails: 

1. Loopback connectors are not installed 

2. Defective SDI cable 

3. Defective bulkhead connector 

4. ECM 

5. SDI connectors J101 or J102 

95 SDI Test Failure: Port A 

Error Type: DF 

Error Description: A drive-resident diagnostic detected a failure with the SDI gate array or 
the TSID gate array involving SDI Port A. If the drive internal test 09 fails for this error code, 
the failure could be in the SDI Port A hardware external to the SDI/TSID gate array as well. 
During internal T09, the testing expects SDI loopback connectors to be attached to the ECM or 
at the cabinet SDI bulkhead. 

Fault Isolation/Correction: 

If test number 08 fails: 

1. ECM 

If test number 09 fails: 

1. Port A loopback connectors are not installed 

2. Defective SDI cable (Port A) 

3. Defective bulkhead connector (Port A) 

4. ECM 

5. SDI connector J102 (Port A) 

96 SDI Failure: Port B 

Error Type: DF 

Error Description: A drive-resident diagnostic detected a failure with the SDI gate array or 
the TSID gate array involving SDI Port B. If the drive internal test 09 fails for this error code, 
the failure could be in the SDI Port B hardware external to the SDI/TSID gate array as well. 
During internal T09, the testing expects SDI loopback connectors to be attached to the ECM or 
at the cabinet SDI bulkhead. 

Fault Isolation/Correction:

If test number 08 fails:

1. ECM 

If test number 09 fails: 

1. Port B loopback connectors are not installed 

2. Defective SDI cable (Port B) 

3. Defective bulkhead connector (Port B) 

4. ECM 

5. SDI connector J101 (Port B) 

98 Can't Execute Diagnostic/Jumper 

Error Type: DF 

Error Description: A diagnostic test could not be run because a hardware jumper was not 
installed. If this error is seen in the field, do not attempt to alter jumpers. 

Fault Isolation/Correction: 

1. Operator (do not attempt to alter jumpers) 

9A Positioner Corrected Event During Data Transfer 

This is typically an event unless analyzed by VAXsimPLUS to be worthy of correction.
Reference expanded discussion of 9A, 9B, and 9C events under error code 9C. 

Error Type: DE 

Error Description: Heads were not fine positioned or locked on track (relative to the 
embedded servo information) at the time a read or write operation was ready to start. The 
drive took necessary procedures to re-establish on-track condition. The drive command was 
received but READ GATE or WRITE GATE had not yet been asserted. 

Fault Isolation/Correction: 

1. HDA (if only one head) 

2. ECM (if 10 of 13 heads) 

9B Write and Positioner Corrected Event 

This is typically an event unless analyzed by VAXsimPLUS to be worthy of correction.
Reference expanded discussion of 9A, 9B, and 9C events under error code 9C. 

Error Type: DE 

Error Description: The master processor determined that the selected read/write head moved 
off track when WRITE GATE was asserted. The condition was corrected. (The read/write heads 
must be within 57.1 microinches from track centerline.) 

Fault Isolation/Correction: 

1. HDA (if only one head) 

2. ECM (if 10 of 13 heads) 

9C Read Gate and Positioner Corrected Event

Error Type: DE 

Error Description: The master processor determined that the selected read/write head moved 
off track when READ GATE was asserted. The condition was corrected. (The read/write heads 
must be within 57.1 microinches from track centerline.) 

TROUBLESHOOTING 9A, 9B, AND 9C:

This is typically an event unless analyzed by VAXsimPLUS to be worthy of correction. 
For HSC/KDM controllers, event rates of <5 per day may be considered normal for 
disks that operate with fairly high I/O rates (continually or in significant bursts) 
provided that the following pattern is noted: 

• Ninety percent of occurrences are with the top five heads (heads 0 through 4).

• One of the top five heads will have few if any errors. 

• No one head in the top five has 90 percent of the errors. (This might point to a 
track/surface problem.) 

If the event pattern matches this, and the event rate exceeds these guidelines, then 
HDA replacement may be necessary. 

If the event pattern does not match this, then further analysis is required. 
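
The pattern check described above can be sketched as a short Python function. It is an illustration only: the function name, the input format (one head number per logged event), and the cutoff used for "few if any errors" are assumptions rather than anything defined by this manual or by VAXsimPLUS.

```python
from collections import Counter

TOP_HEADS = range(5)  # heads 0 through 4

def matches_normal_pattern(head_events, few=1):
    """Check 9A/9B/9C events against the three-part pattern.

    head_events -- list of head numbers, one entry per logged event
    few         -- hypothetical cutoff for "few if any errors"
    """
    total = len(head_events)
    if total == 0:
        return False
    counts = Counter(head_events)
    top_counts = [counts.get(head, 0) for head in TOP_HEADS]
    # 1. Ninety percent of occurrences are on the top five heads.
    mostly_top = sum(top_counts) >= 0.9 * total
    # 2. One of the top five heads has few if any errors.
    one_quiet = min(top_counts) <= few
    # 3. No single top-five head has 90 percent of the errors.
    no_dominant = max(top_counts) < 0.9 * total
    return mostly_top and one_quiet and no_dominant
```

The event rate (fewer than 5 per day for HSC/KDM controllers) still has to be judged separately; this sketch looks only at the distribution across heads.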

For KDA/KDB/UDA controllers, event rates should not exceed 16 per day on heavily used disks 
(I/O rates of 30 per second). 

If these events occur over 10 of the 13 heads, then the occurrence may be related to a general 
servo/read path problem. This is possibly an electronics problem that may not involve the HDA. 

If these errors occur primarily on one head, there is strong head/surface correlation and possible 
HDA replacement is warranted. 

The above number of events to be expected was determined by analysis and experience with the 
RA90 HDA 70-22951-01. With the introduction of the RA92 (HDA 70-27492-01), the number 
of 9A, 9B, and 9C events has decreased significantly. The phase-in of the RA92 HDA hardware 
mechanics (resulting in an RA90-compatible HDA 70-27268-01) into RA90 production has 
substantially reduced the occurrence of these events because of the new design. 

Fault Isolation/Correction: 

1. HDA (if only one head) 

2. ECM (if 10 of 13 heads) 

9D Error Log Header Corrupted 

Error Type: DF 

Error Description: A location in EEPROM containing drive-resident error log identifier 
information, device type, or descriptor size is invalid. 

Fault Isolation/Correction: 

1. Attempt to load new microcode 

2. ECM 

9E Drive Faulted, Test Cannot Run

Error Type: DF

Error Description: Drive-resident diagnostics cannot run while the drive is faulted.

Fault Isolation/Correction:

1. Check fault condition 

2. ECM 

9F Error Log Check Point Code

Error Type: Informational Only

Error Description: If drive-resident diagnostic T50 has been used to place a checkpoint 
between errors in the drive internal error log, a 9F entry will be seen in the drive internal error 
log. This makes drive troubleshooting easier by placing a null field between errors in the drive 
internal error log to partition repair activity. 

Fault Isolation/Correction: 

1. None (read Error Description above) 

A0 Unable to Clear SDI Array Safety Status Register

Error Type: DF 

Error Description: Drive-resident diagnostics attempted to clear the SDI gate array safety 
status registers but were unsuccessful. 

Fault Isolation/Correction: 

1. To isolate the stuck bit, check the preceding error in the drive internal error log storage silo.
Base corrective action on the preceding error. 

2. ECM 

A1 Unable to Force Encoder Error 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force a read/write 
encoder/decoder (RWENDEC) error. 

Fault Isolation/Correction: 

1. ECM 

A2 Unable to Force Multiple Head Select While Reading 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force read gate and multi-chips 
error. 

Fault Isolation/Correction: 

1. PCM 

2. ECM 

3. HDA 

A3 Unable to Force Write Gate and Write Unsafe

Error Type: DF 

Error Description: A drive-resident diagnostic was unable to force write gate and write 
unsafe error conditions. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

A4 Unable to Force Write Current and No Write Gate 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force write current and no write 
gate error conditions and detect such a condition. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

A5 Unable to Force Write Gate and No Write Current 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force write gate and no write 
current error conditions. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

A6 Unable to Force Read Gate and Off Track Error 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force read gate and off track 
error conditions. 

Fault Isolation/Correction: 

1. ECM 

A7 Unable to Force Write Gate and Off Track Error 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force write gate and off track 
error conditions. 

Fault Isolation/Correction: 

1. R/W cable to PCM 

2. ECM 

A8 Unable to Force Read and Write Fault While Writing

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force a read/write-while-faulted 
error condition. 

Fault Isolation/Correction: 

1. ECM 

A9 Servo Fault/Force Fault Test 

Error Type: DF 

Error Description: A servo check occurred while the diagnostic firmware was attempting to 
execute the force fault subtest. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

AB Forced Read and Write Fault While Reading

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to force a read/write-while-faulted 
error condition. 

Fault Isolation/Correction: 

1. ECM 

AD UART Overrun or Framing Error 

Error Type: DE 

Error Description: The master processor internal UART detected an overrun condition or a 
framing error condition on data received from the OCP.

Fault Isolation/Correction: 

1. OCP 

2. ECM 

3. Blower/bezel assembly 

AE OCP Data Packet Checksum Error 

Error Type: DE 

Error Description: Data packets transmitted between the master processor and the OCP 
processor are in error. 

Fault Isolation/Correction: 

1. ECM 

2. OCP 

3. Blower/bezel assembly 



DIGITAL INTERNAL USE ONLY 



5-96 Troubleshooting and Error Codes 

AF OCP Start Byte is Not a Sync Character

Error Type: DE 

Error Description: The first byte the master processor expects in a data packet transfer is a 
sync character. This error indicates no sync character was received. 

Fault Isolation/Correction: 

1. ECM 

2. OCP 

3. Blower/bezel assembly 

B0 OCP Invalid Response

Error Type: DE 

Error Description: The OCP processor did not acknowledge a command from the master 
processor. 

Fault Isolation/Correction: 

1. OCP 

2. ECM 

3. Blower/bezel assembly 

B2 OCP Retransmit Failure 

Error Type: DE 

Error Description: The OCP processor can request three retransmits from the master
processor. This error indicates the OCP requested more than three consecutive retransmit 
responses. 

Fault Isolation/Correction: 

1. OCP 

2. ECM 

3. Blower/bezel assembly 
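
The retransmit limit behind error code B2 amounts to counting consecutive requests. The sketch below models that rule only; the function name and the "RETRANSMIT" token are hypothetical and do not represent the actual OCP packet protocol.

```python
MAX_RETRANSMITS = 3  # the OCP may request three retransmits

def retransmit_failure(responses):
    """Return True if more than three consecutive retransmit requests occur.

    responses -- iterable of response tokens; "RETRANSMIT" is a
                 hypothetical marker for a retransmit request
    """
    consecutive = 0
    for response in responses:
        if response == "RETRANSMIT":
            consecutive += 1
            if consecutive > MAX_RETRANSMITS:
                return True   # condition reported as B2
        else:
            consecutive = 0   # any other response breaks the run
    return False
```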

B3 OCP Command Unsuccessful 

Error Type: DE 

Error Description: An incorrect response was received from the OCP processor after the 
master processor issued a SEND STATUS command to the OCP. 

Fault Isolation/Correction: 

1. OCP 

2. ECM 

B4 OCP Command Timeout

Error Type: DE 

Error Description: The OCP processor did not respond to a master processor command 
within its allotted timeout period. As a result of this error, the master processor logs a B6 error 
condition into EEPROM and latches B4 into the display. 

Fault Isolation/Correction: 

1. OCP 

2. ECM 

3. Blower/bezel assembly 

B0 Master Processor UART Loopback Test Failure 

Error Type: DF 

Error Description: Drive-resident diagnostics were unable to transmit and receive data 
through the master processor serial communications interface (SCI). 

Fault Isolation/Correction: 

1. ECM 

B8 Master Processor UART Transmitter/Receiver Error 

Error Type: DE 

Error Description: The OCP failed to transmit or receive data through its master processor 
serial port. 

Fault Isolation/Correction: 

1. OCP 

B9 OCP-to-Master Processor Communications Timeout Failure 

Error Type: OCP Error Code 

Error Description: The master processor failed to communicate with the OCP processor 
within 4 seconds after power-up. 

Fault Isolation/Correction: 

1. OCP 

2. ECM 

3. Blower/bezel assembly 

BA OCP Init Timeout Failure

Error Type: OCP Error Code 

Error Description: The master processor failed to communicate with the OCP processor 
within 4 seconds after issuing an initialize request to the OCP processor. 

Fault Isolation/Correction: 

1. OCP 

2. ECM 

3. Blower/bezel assembly 

BB OCP Processor ROM Checksum Failure

Error Type: OCP Error Code 

Error Description: The OCP processor performed a ROM checksum, and the calculated 
checksum did not match the stored checksum. 

Fault Isolation/Correction: 

1. OCP 

BC Cartridge Checksum Failure

Error Type: DF

Error Description: Invalid microcode was detected in the microcode update cartridge.

Fault Isolation/Correction:

1. Reseat update cartridge (retry T40) 

2. Defective cartridge 

3. OCP 

4. ECM 

BD Microcode Update Cartridge Detection Failure 

Error Type: DF 

Error Description: The microcode update utility (T40) was attempted without an update 
cartridge in place. 

Fault Isolation/Correction: 

1. Cartridge is not inserted 

2. Defective cartridge 

3. OCP 

4. ECM 

BE Cartridge/EEPROM/Master Processor Consistency Check 

Error Type: DF 

Error Description: Microcode contained within the cartridge is inconsistent with the 
microcode in the master processor, EPROM, or EEPROM. The microcode update process is 
halted to prevent loading incompatible microcode. The product revision matrix documentation 
shows which codes are compatible. 

Fault Isolation/Correction: 

1. Reseat update cartridge 

2. Replace update cartridge with a compatible cartridge 

3. ECM 

BF Error Log Write Compare Error

Error Type: DE 

Error Description: Each time the drive writes an error log entry into the error silo, it verifies 
the data written. The microcode got a data compare error on the page (16 bytes) that was 
written. This is not a fatal error. Should that particular silo entry be rewritten, it may or may 
not fail again. This error code is not written to EEPROM but may be displayed at the time of 
the error if the fault button is depressed. 

Fault Isolation/Correction: 

1. ECM 

C0 Hardware Revision and Microcode Incompatibility

Error Type: DE 

Error Description: The microcode has determined, from the revision information visible to it,
that the hardware and/or software combination is incompatible. The microcode looks at the
following hardware revisions in a drive:

• I/O-R/W module hardware revision jumpers 

• Servo module hardware revision jumpers 

• PCM switch pack (S1-1 through S1-4)

• HDA revision bits information 

Most of this hardware revision information can be determined by executing drive internal test 
T45 (see Chapter 4), then decoding the reported revision information. 

After checking this internal revision information, the microcode modifies the hardware
revision that the drive reports to the subsystem and host.

Microcode revision 9 was the first release that checked for HDA revision. Subsequent microcode 
revisions have been expanding on the compatibility testing. With the RA92 (microcode revisions 
20 and later), a significant amount of revision checking/testing is necessary for the microcode to 
properly configure itself as to the type of drive (RA90 vs RA92), type of HDA (short arm vs long 
arm), type of format (RA90 vs RA92), and type of ECM (70-22942-01 vs 70-22942-02). 

To determine TOTAL compatibility, you must verify: 

• Code compatibility to ECM 

• Code compatibility to HDA 

• ECM compatibility to HDA 

• PCM and HDA compatibility 

• PCM switch pack setup. 

Reference the compatibility tables in Chapter 3. 

With microcode revisions 20 and later, the C0 LED error is a very significant fault to the
drive and must be resolved. The error type was redefined to a drive error.

Fault Isolation/Correction: If the HDA has just been replaced, replace it again with a 
compatible revision or load compatible drive microcode in the ECM. 



If the drive HDA and microcode were operational before the failure, then revision bits are now 
being detected in error. This will require careful troubleshooting. A series of tables in the 
RA90/RA92 Disk Drive Pocket Reference Card have been prepared to assist in determining
and resolving this error condition. Additional tables are provided in Chapter 3.

1. If the HDA has just been replaced: load compatible microcode 

2. If the PCM has just been replaced: check PCM switch pack S1-1 through S1-4 for correct
switch settings. Refer to the RA90/RA92 Disk Drive Pocket Reference Card and the tables 
in Chapter 3. 

3. If the ECM has just been replaced: check microcode compatibility. Refer to the RA90/RA92 
Disk Drive Pocket Reference Card and the tables in Chapter 3. 

4. R/W cable 

5. PCM 

6. ECM 

C1 Outer Guardband Detected After HEAD LOAD Command

Error Type: DF 

Error Description: The GASP gate array detected outer guardband after a HEAD LOAD 
command. 

Fault Isolation/Correction: 

1. ECM 

C2 Inner Guardband Detected After HEAD LOAD Command 

Error Type: DF 

Error Description: The GASP gate array detected inner guardband after a HEAD LOAD 
command. 

Fault Isolation/Correction: 

1. ECM 

C3 Seek to Outer Guardband Failed 

Error Type: DF 

Error Description: The servo processor was issued a SEEK command to the outer guardband 
area of the disk but failed the seek.

Fault Isolation/Correction: 

1. Clean cabinet air vent grill 

2. ECM 

3. PCM 

4. Blower/bezel assembly 

5. HDA 

C4 Seek to Outer Guardband Not Detected

Error Type: DF 

Error Description: The servo processor was issued a SEEK command to the outer guardband 
area of the disk, but the OGB flag was not detected. 

Fault Isolation/Correction: 

1. ECM 

C5 HDA and ECM Incompatibility 

Error Type: DF 

Error Description: The microcode has determined that the reported HDA type and ECM type 
are incompatible. Specifically, the incompatible combination is an old ECM type (70-22942-01) 
and an RA92 HDA. 

Microcode revision 25 was the first release to check specifically for this error. Previous 
microcode revisions (revision 9 and later) will report this condition as error code C0.

Fault Isolation/Correction: If the HDA or ECM has just been replaced, make sure compatible 
part numbers have been used. 

If the PCM has just been replaced (part of the HDA FRU assembly), make sure switches S1-1
through S1-4 are set correctly. (See Chapter 3 or compatibility tables in the RA90/RA92 Disk
Drive Pocket Reference Card.) 

If HDA, PCM, ECM and drive microcode were operational before the failure, then the switch 
pack S1 on the PCM and/or the I/O-R/W and servo revision jumpers are now being detected
in error. This will require careful troubleshooting. See drive error code C0 for additional
troubleshooting information. 

1. R/W cable 

2. PCM (check S1 switch pack setting)

3. ECM (replace with P/N 70-22942-02) 
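
Which code is reported for an old ECM (70-22942-01) paired with an RA92 HDA depends on the microcode revision. A minimal sketch of that lookup follows. The function name is hypothetical, and returning None for revisions earlier than 9 is an assumption drawn from the statement under error code C0 that revision 9 was the first to check HDA revision.

```python
def code_for_old_ecm_with_ra92_hda(microcode_rev):
    """Return the error code expected for an old ECM with an RA92 HDA."""
    if microcode_rev >= 25:
        return "C5"   # first revision to check specifically for this case
    if microcode_rev >= 9:
        return "C0"   # reported as a general revision incompatibility
    return None       # assumption: HDA revision is not checked before rev 9
```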

C6 PLO Failure 

Error Type: DE 

Error Description: The voltage controlled oscillator (VCO) is not synchronized to the 
dedicated servo information read from the media. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

C7 Seek to Inner Guardband Failed 

Error Type: DF 

Error Description: The servo processor was issued a SEEK command to the inner guardband 
area of the disk but failed the seek. 

Fault Isolation/Correction:

1. ECM 

2. PCM 

3. HDA 

C8 Inner Guardband Not Detected After Seek to Inner Guardband 

Error Type: DF 

Error Description: A SEEK command, issued to the servo processor to seek to the inner 
guardband area, failed to detect the inner guardband flag. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

C9 Analog Loop Test Failure 

Error Type: DE 

Error Description: The D/A and A/D circuitry did not respond correctly while tested in a loop. 
The servo processor performs the analog testing on these circuits. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

CA Media Not Spinning 

Error Type: DF 

Error Description: Selected drive-resident diagnostics could not be executed because the 
drive was spun down. 

Fault Isolation/Correction: 

1. Spin up drive 

2. ECM 

CC Servo Processor Recalibrate Failed

Error Type: DE

Error Description: A RECALIBRATE command issued to the servo processor failed.

Fault Isolation/Correction:

1. ECM 

2. PCM 

3. HDA 

CD Track Counter (Gray Code)

Error Type: DE 

Error Description: During coarse positioning, both gray code bits (X and Y) changed during 
the same servo frame; or the same gray code changed (X or Y) during two consecutive servo 
frames. 

Fault Isolation/Correction: 

1. HDA 

2. ECM 
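
The two conditions that raise error code CD can be expressed as a check over successive servo frames. The sketch below is illustrative only: the function name and the representation of each frame as an (X, Y) bit pair are assumptions, not the servo processor's actual data format.

```python
def gray_code_fault(frames):
    """Return the index of the first frame that violates the CD rules.

    frames -- sequence of (x, y) gray code bit pairs, one per servo frame
    Returns None if no violation is found.
    """
    last_changed = None  # bit that changed in the previous frame, if any
    for i in range(1, len(frames)):
        x_changed = frames[i][0] != frames[i - 1][0]
        y_changed = frames[i][1] != frames[i - 1][1]
        # Rule 1: both bits changed during the same servo frame.
        if x_changed and y_changed:
            return i
        changed = "X" if x_changed else "Y" if y_changed else None
        # Rule 2: the same bit changed in two consecutive servo frames.
        if changed is not None and changed == last_changed:
            return i
        last_changed = changed
    return None
```

A normal crossing sequence such as 00, 01, 11, 10, 00 alternates between the two bits and passes both rules.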

CE EEPROM Write Cycle Timeout 

Error Type: DE 

Error Description: During an EEPROM write operation, a location in EEPROM could not be 
written within 20 milliseconds. 

Fault Isolation/Correction: 

1. ECM 

CF Invalid Data In EEPROM

Error Type: DE

Error Description: Error information in EEPROM was found to be invalid.

Fault Isolation/Correction:

1. ECM 

E0 Spindle Rotation Not Detected 

Error Type: DE 

Error Description: The servo system has not detected Hall sensor signal transitions. This 
indicates either the spindle motor is not turning or the Hall sensor circuitry has failed. An open 
motor coil (or drive circuitry) will show this symptom if that particular phase is needed to start 
the spindle drive. See error code 13 before replacing FRUs. 

With microcode revisions 19 and earlier, this error was spindle speed unsafe — basically the same 
error detection. 

After microcode revision 20, this error is simply failure to detect that the spindle has performed 
any motion. The servo monitors the Hall sense S1 signal (reference error code 13). If it detects
any transition on this specific motor control signal, then this check is okay. 

Fault Isolation/Correction: 

1. ECM 

2. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear 
flex cable assembly should be neatly dressed along the sides of the chassis at the rear) 

3. Servo-to-spindle motor interconnect 

4. Brake failure (on/open all the time) 

5. HDA 

6. Rear flex cable assembly 

E1 Spindle Speed Out Of Range

Error Type: DE 

Error Description: Spindle speed is monitored initially by input from the Hall sensors inside 
the HDA spindle motor. Improper spindle speed, as detected by the Hall sensors, may prevent 
proper speed control until the PLO frequency lock range is attained. Once the spindle speed 
is within the PLO range, the servo system begins to look for servo data to which it can lock
its frequency. This error implies that the drive is unable to establish spindle speed rotation
within the range required (RA90=3600 rpm, RA92=3405 rpm). 

An open failure of a spindle motor coil winding, or a motor drive circuitry failure, or a bad Hall
sense S1 or S2 circuit will cause this type of error. See error code 13 for measurement points
and troubleshooting aids before replacing FRUs. 

Fault Isolation/Correction: 

1. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear 
flex cable assembly should be neatly dressed along the sides of the chassis at the rear) 

2. ECM 

3. Continuity checks (refer to Table 5-10)

4. HDA 

E2 A/D or D/A Converter Insane 

Error Type: DE 

Error Description: The servo processor detected a failure in its A/D or D/A converters during 
a precheck before the head load was initiated. 

Fault Isolation/Correction: 

1. ECM 

2. If you load microcode revision 13 (or earlier) into a 70-22942-02 (RA92-compatible) ECM, a 
solid E2 error will be seen upon drive spinup. 

E3 Excessive Positioner Current During Test 

Error Type: DE 

Error Description: The servo processor detected a failure in the power amp circuitry that 
indicates a shorted condition. 

Fault Isolation/Correction: 

1. ECM 

E4 Open Circuit Detected During Power Amp Toggle Test 

Error Type: DE 

Error Description: An open was detected in the power amp circuitry during a head load 
sequence. Power is applied to the positioner in a toggle fashion during the head load sequence. 

Reference error code 13 for information that may be useful for isolating an open circuit of the 
actuator. An ohmmeter measurement might verify this condition at the HDA. 

Fault Isolation/Correction:

1. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear 
flex cable assembly should be neatly dressed along the sides of the chassis at the rear) 

2. ECM 

3. HDA 

E5 Overcurrent Detected During Actuator Test 

Error Type: DE 

Error Description: The servo processor detected an overcurrent condition before attempting a 
head load process. 

Fault Isolation/Correction: 

1. ECM 

E6 Track Counter Clear Failure 

Error Type: DE 

Error Description: The track counter failed to clear indicating establishment of cylinder 0. 
This is the final phase of the head load/RTZ process that must be accomplished. 

Loss of PLO during this portion of the head load/RTZ process will also cause this error. See the 
note in the error description for error code E8. 

Fault Isolation/Correction: 

1. ECM (most likely) 

2. PCM 

3. HDA 

E7 Illegal Zone Detected 

Error Type: DE 

Error Description: The servo system is executing a head load or RTZ operation. 

For microcode revisions 19 and earlier, the order of band detection is: outer guardband, data 
area, then inner guardband area. 

For microcode revisions 20 and later, the order of band detection that the servo system is 
looking for is OGB, data area, then back to OGB. In this case (without an E9 error), the servo 
system could not re-establish finding the OGB area (the second time). The servo system will 
spend up to one second trying to re-establish the OGB area. 

Loss of PLO during this portion of the head load/RTZ process will also cause this error. See the 
note in the error description for error code E8. 

Fault Isolation/Correction: 

1. ECM (most likely) 

2. HDA 

3. PCM 
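
The band detection order described for error code E7 can be summarized in a short sketch. This is illustrative only and is not drive microcode; the function names and the band labels "OGB", "DATA", and "IGB" are assumptions chosen for the example, and the one-second retry window is not modeled.

```python
from typing import List


def expected_band_order(microcode_rev: int) -> List[str]:
    """Return the band sequence the servo looks for during head load/RTZ.

    Revisions 19 and earlier: outer guardband, data area, inner guardband.
    Revisions 20 and later: outer guardband, data area, back to outer guardband.
    """
    if microcode_rev <= 19:
        return ["OGB", "DATA", "IGB"]
    return ["OGB", "DATA", "OGB"]


def is_illegal_zone(observed: List[str], microcode_rev: int) -> bool:
    """Hypothetical check: True when the observed bands differ from the expected order."""
    return observed != expected_band_order(microcode_rev)
```

For example, a band sequence that ends in the inner guardband matches the expected order for revision 19 but not for revision 20.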

E8 Outer Guardband Timeout

Error Type: DE 

Error Description: Servo is in the outer guardband (OGB) of the disk and wants to be able to 
detect this region by looking for the OGB pattern from the dedicated servo information. At this 
time, however, the servo cannot establish PLO lock and faults. Interruption of the servo data 
stream is likely. Up to 3.4 seconds is allocated to trying to find servo data. 

NOTE 

PLO Loss During Head Load/RTZ: The PLO coming unlocked is a fairly serious error
to a servo system. It causes all the servo information to become unreadable. There
are now four different codes for the PLO being unlocked, depending on when it
happens:

• At the beginning of RTZ, if unable to establish lock, an E8 is reported. 

• Midway through the RTZ, if lock is lost while scanning the disk for the OGB, an 
E7 is reported. 

• Late in the RTZ, while going from the OGB to cylinder 0, lost lock results in an E6. 

• During normal track following and seeking, lost lock causes an EC. 

These are the error codes reported by the servo and logged in the error log while 
functional I/O code is running. Diagnostic I/O code may log (and the OCP may 
display) the I/O's error code of C6 for a PLO failure. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 
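
The four PLO-loss cases in the note for error code E8 can be kept as a small lookup table when reviewing error logs. The sketch below is an illustration only; the phase names and the function are assumptions invented for the example, not identifiers used by the drive.

```python
# Servo error code reported when PLO lock is lost, keyed by a
# hypothetical name for the point in the operation where it happens.
PLO_LOSS_CODES = {
    "rtz_start": "E8",             # unable to establish lock at the beginning of RTZ
    "rtz_ogb_scan": "E7",          # lock lost while scanning the disk for the OGB
    "rtz_ogb_to_cyl0": "E6",       # lock lost going from the OGB to cylinder 0
    "track_follow_or_seek": "EC",  # lock lost during normal track following or seeking
}


def plo_loss_code(phase: str, diagnostic_io: bool = False) -> str:
    """Return the code expected in the log for a PLO loss in the given phase.

    When diagnostic I/O code is running, the I/O's own code C6 may be
    logged instead; this sketch simply returns C6 in that case.
    """
    if diagnostic_io:
        return "C6"
    return PLO_LOSS_CODES[phase]
```

Calling `plo_loss_code("rtz_ogb_scan")` returns the E7 case described in the note.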

E9 Gray Code Timeout During the Turnaround State 

Error Type: DE 

Error Description: No gray code transitions were detected during a hold sequence. The 
drive is attempting a head load (NRZ), is in the OGB, and has PLO locked, reading its OGB 
position. At this point, the servo is attempting to move forward to look for track crossings and 
the eventual detection of the data area of the disk. However, the servo cannot get the positioner 
to move. The servo will spend up to 3.4 seconds trying to move the positioner. 

A sticky (dragging) actuator lock pin or faulty actuator lock solenoid will also cause this error. 

Fault Isolation/Correction: 

1. HDA (positioner lock solenoid failure — see error code 13) 

2. Rear flex cable assembly (visually inspect for damage; HDA removal is necessary. The rear
flex cable assembly should be neatly dressed along the sides of the chassis at the rear.)

3. ECM 

EA Gray Code Timeout During Outer Guardband State

Error Type: DE

Error Description: No gray code transitions were detected during a head load sequence.

Fault Isolation/Correction:

1. Visually inspect rear flex cable assembly 

2. HDA (positioner lock solenoid failure — see error code 13) 

3. ECM 

EB Sector Pulse Timeout During Sync-Up State

Error Type: DE

Error Description: An index pulse was detected but no sector pulse was detected in the data
area of the disk. Heads may not be positioned over the data area.

Fault Isolation/Correction: 

1. ECM 

2. HDA 

EC Servo Fault and PLO Fault Bit Set In GASP 

Error Type: DE 

Error Description: The servo fault and PLO fault bits are both set in the GASP, but it 
was noted by the servo processor that the PLO had come unlocked. This error is similar to error
code 25; however, the servo processor did see the PLO deassert, which in turn caused the servo
fault bit to set.

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

ED Servo Watchdog Timeout 

Error Type: DE 

Error Description: The digital signal processor (DSP) was not interrupted on time by the 
GASP. Possibly, the servo clock signal is not present or is not being detected properly. The 
timeout is 820 microseconds. 

Fault Isolation/Correction: 

1. ECM 

EE Servo Digital Signal Processor Reset 

Error Type: DE 

Error Description: The Servo DSP has been reset. As a result, the profiles for the drive have 
not been loaded by the master processor. The DSP is sane, but has not been told what type 
of HDA is present in the drive — it may be an RA90 long arm, RA90 short arm, or an RA92. 
Therefore, the servo will not load its servo tables or move the actuator. This is an unusual 
error condition. The master processor should have reinitialized the drive characteristics into 
the servo system. 

Fault Isolation/Correction:

1. Turn drive power off and on 

2. ECM 

EF Head Unload Failed 

Error Type: DE 

Error Description: The servo processor responded with an error condition to a HEAD 
UNLOAD command. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

F0 Servo Microcode Update Failed

Error Type: DE 

Error Description: The servo processor did not send a SUCCESSFUL acknowledgment when 
the master processor attempted to load external servo processor RAM with new microcode. 
When the drive powered up, a microcode update occurred or a servo timeout took place. The 
master processor did a compare of EEPROM to RAM microcode. The data did not compare. 

Fault Isolation/Correction: 

1. I/O-R/W to servo cable connection 

2. ECM 

F1 Command to Servo Processor Timed Out 

Error Type: DE 

Error Description: The master processor attempted to issue an UNLOAD command to the 
servo processor; however, the command timed out during its execution. 

Fault Isolation/Correction: 

1. ECM 

F3 Servo Spinup Failed 

Error Type: DE 

Error Description: The master processor issued a SPINUP command to the servo processor 
and the servo processor responded with an error condition. 

Fault Isolation/Correction: 

1. ECM 

2. Brake assembly 

3. HDA 

F4 Servo Spindown Failed

Error Type: DE 

Error Description: The master processor issued a command to spin down the drive. The 
servo processor responded with an error condition. 

Fault Isolation/Correction: 

1. ECM 

F5 Seek Failed 

Error Type: DE 

Error Description: The servo processor returned an error condition in response to a SEEK 
command from the master processor. 

NOTE 

T65 does not check for out-of-range values. Do not exceed the maximum specified 
input values. Also, the last cylinder parameter must always be equal to or greater 
than the first cylinder parameter. If an invalid cylinder value is entered, a (servo) 
Seek Failed error (F5) occurs. 

Fault Isolation/Correction: 

1. HDA 

2. ECM (if 10 of 13 heads) 
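
Because T65 does not check its own inputs, the cylinder parameters can be checked before they are entered. The following sketch is an illustration under stated assumptions: the function name is invented for the example, and `max_cylinder` must be supplied by the caller from the drive specification, since this section does not state the maximum value.

```python
from typing import List


def check_t65_cylinders(first: int, last: int, max_cylinder: int) -> List[str]:
    """Return a list of problems with the T65 cylinder parameters (empty if none).

    max_cylinder is an assumed, caller-supplied limit; it is not taken
    from this section of the manual.
    """
    problems = []
    for name, value in (("first", first), ("last", last)):
        if not 0 <= value <= max_cylinder:
            problems.append(f"{name} cylinder {value} is outside 0..{max_cylinder}")
    if last < first:
        problems.append("last cylinder must be equal to or greater than first cylinder")
    return problems
```

An empty result means the parameters satisfy both rules in the note; any entry in the list describes an input that could produce a Seek Failed error (F5).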

F6 Head Switch Failed 

Error Type: DE 

Error Description: The servo processor responded with an error condition to a HEAD 
SWITCH command initiated by the master processor. 

Fault Isolation/Correction: 

1. HDA 

2. ECM (if 10 of 13 heads) 

F7 RTZ Failed 

Error Type: DE 

Error Description: The master processor issued a RETURN TO ZERO (RTZ) command to the 
servo processor, and the servo processor responded with an error condition. 

Fault Isolation/Correction: 

1. ECM 

2. HDA 

F8 Head Load Failed 

Error Type: DE 

Error Description: The master processor issued a HEAD LOAD command to the servo 
processor, and the servo processor responded with an error condition and no specific error 
information with it, or the head load timed out. 

Fault Isolation/Correction:

For microcode revisions 19 or earlier: 

1. ECM (if 10 of 13 heads) 

2. PCM 

3. HDA 

For microcode revisions 20 or later:

1. ECM

F9 Diagnostic Command Failed 

Error Type: DF 

Error Description: The servo processor responded with an error or a timeout condition to a 
DIAGNOSE command issued by the master processor. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

FA Servo Processor Failed Seek to DGN Write Cylinder 

Error Type: DF 

Error Description: A seek to the diagnostic (DGN) write/read cylinder failed while under 
diagnostics control. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

FB Servo Processor Failed Seek to DGN Read Cylinder 

Error Type: DF 

Error Description: A seek to the diagnostic (DGN) read-only cylinder failed while under 
diagnostics control. 

Fault Isolation/Correction: 

1. ECM 

2. PCM 

3. HDA 

FD EEPROM Checksum Error

Error Type: DF

Error Description: An incorrect checksum was detected in one of the EEPROMs.

Fault Isolation/Correction:

1. Reload microcode 

2. ECM 
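
The Fault Isolation/Correction lists above are ordered, so they lend themselves to a simple lookup that returns the next action to try. The sketch below is illustrative only: it covers just a few of the codes listed in this section, and the table and function names are assumptions made for the example.

```python
# Ordered fault isolation actions for a few error codes, copied from the
# lists in this section. The table is deliberately incomplete.
FAULT_ISOLATION = {
    "E3": ["ECM"],
    "E6": ["ECM", "PCM", "HDA"],
    "E7": ["ECM", "HDA", "PCM"],
    "E8": ["ECM", "PCM", "HDA"],
    "FD": ["Reload microcode", "ECM"],
}


def next_action(code, already_tried=()):
    """Return the first listed action not yet tried, or None if all were tried."""
    for action in FAULT_ISOLATION[code.upper()]:
        if action not in already_tried:
            return action
    return None
```

For example, after the ECM has been replaced for an E7 error without success, `next_action("E7", ["ECM"])` returns the HDA as the next candidate.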

6

Removal and Replacement Procedures

6.1 Introduction

This chapter describes the removal and replacement procedures for RA90 and RA92 disk drive
components. No tools are needed to remove or replace the six major field replaceable units (FRUs) 
that make up the RA90/RA92 disk drive. However, tools are required for the removal and/or 
replacement of some drive components. A tools checklist is included to identify these tools. Tools 
are also identified in procedures where needed. 

Figure 6-1 shows an exploded view of the RA90/RA92 disk drive. 

Figure 6-1 RA90/RA92 Disk Drive - Exploded View

6.2 Sequence for FRU Removal

Remove RA90/RA92 FRUs in the following sequence: 



From the front of the cabinet:

• Cabinet front panel/drive grill

• OCP

• Blower motor assembly

• ECM

• PCM

• HDA (followed by the spindle ground brush, brake, and solenoid)

From the rear of the cabinet:

• Cabinet rear panel

• Power supply



Figure 6-2 FRU Removal Sequence 

Use care when removing and replacing drive components. Never force fit drive modules or 
components. Generally, steady, firm pressure and correct alignment ensure proper seating of
drive components. If you encounter resistance during FRU removal or replacement, check for bent 
pins, obstructions, or improper alignment of parts. 
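
The removal sequence can also be expressed as a table of prerequisites, which makes it easy to list everything that must come out before a given part. The sketch below is an illustration only; the prerequisite table is one reading of Figure 6-2 and of the steps in the procedures in this chapter, and all names in it are assumptions made for the example.

```python
# What must be removed immediately before each item (assumed reading of
# Figure 6-2 and the removal procedures in this chapter).
PREREQUISITES = {
    "front panel": [],
    "OCP": ["front panel"],
    "blower motor assembly": ["OCP"],
    "ECM": ["blower motor assembly"],
    "PCM": ["blower motor assembly"],
    "HDA": ["blower motor assembly"],
    "spindle ground brush": ["HDA"],
    "brake": ["spindle ground brush"],
    "solenoid": ["HDA"],
    "rear panel": [],
    "power supply": ["rear panel"],
}


def removal_order(target):
    """Return the items to remove, in order, ending with the target itself."""
    order = []

    def visit(item):
        # Remove everything the item depends on first, then the item.
        for dependency in PREREQUISITES[item]:
            visit(dependency)
        if item not in order:
            order.append(item)

    visit(target)
    return order
```

The brake entry lists the spindle ground brush as its prerequisite because the brake procedure removes the ground brush before the brake hold-down screws.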

6.3 Electrostatic Sensitivity 

Drive components and FRUs are highly sensitive to electrostatic shock. Use proper ESD methods 
when handling drive components. (Refer to Section 1.4, Electrostatic Protection.) 

6.4 Power Precautions 

Since hazardous voltages are present in this equipment, it is recommended that only trained service 
personnel attempt to service this equipment. 

WARNING 

Always remove power from the unit before removing or replacing any internal part or
cable. Bodily injury or equipment damage may result from improper servicing.

6.5 Tools Checklist 

Most RA90 and RA92 disk drive repairs can be performed without the use of tools. However, the 
following tools are required during some procedures: 

• 5/32 Hex wrench

• 1/16 Allen wrench

• 3/32 Allen wrench

• 5/32 Allen wrench

• 3/16 Allen wrench

• Pliers

• Needlenose pliers 

• Medium Phillips screwdriver 

• Flat-blade screwdriver 

6.6 Removing/Replacing Cabinet Front and Rear Access Panels 

Procedures contained in this chapter require the removal of cabinet front and rear access panels. 
Panel removal and replacement procedures follow. 

6.6.1 Removing/Replacing the Front Access Panel 

To remove the cabinet front access panel (refer to Figure 6-3):

1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top 
of the panel. Turn the fasteners counterclockwise. 

2. Grasp the panel by its edges, tilt it toward you, and lift it up about 2 inches. Remove the panel 
and store it in a safe place. 

To reinstall the front access panel:

1. Lift the panel into place and lower it straight down until the tabs on the panel's lower edge 
engage the slots in the cabinet support bracket. 

2. Holding the panel flush with the cabinet, use a hex wrench to lock the quarter-turn fasteners. 
Turn the fasteners clockwise. 

6.6.2 Removing/Replacing the Rear Access Panel 

To remove the cabinet rear access panel (refer to Figure 6-4):

1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top 
of the panel. Turn the fasteners counterclockwise. 

2. Tilt the panel toward you and lift it up to disengage the pins at the bottom.

3. Lift the panel clear of the enclosure and store it in a safe place.

To reinstall the rear access panel:

1. Lift the panel into place and fit the pins into the holes at the top of the I/O bulkhead. 

2. Push the top of the panel into place and use a hex wrench to lock the quarter-turn fasteners. 
Turn the fasteners clockwise. 

Figure 6-3 Front Access Panel Removal

Figure 6-4 Rear Access Panel Removal

6.7 Removing the Operator Control Panel 

The operator control panel (OCP) is secured to the bezel/blower assembly by the OCP-to-blower 
connector and by flexible metal retention clips. 

NOTE 

Note the orientation of the OCP before removing. 

To remove the OCP (refer to Figure 6-5):

1. Remove power from the drive. 

2. Grip the OCP in the middle and gently pull it towards you. 

3. Note OCP-to-blower connector orientation. 

Reverse this process to replace the OCP. (Check for bent pins before replacing.) 

Figure 6-5 OCP Removal

6.8 Removing the Blower/Bezel Motor Assembly 

Although the bezel and blower motor assembly are removed as one unit from the drive chassis, the 
bezel and blower motor assembly are two separate units. The blower motor assembly is the FRU. 

Pay particular attention to the blower motor orientation and blower motor-to-ECM connection. 

To remove the blower motor assembly (refer to Figure 6-6):

1. Remove power from the drive. 

2. Remove the OCP (refer to Section 6.7). 

3. Note blower motor orientation before removing. 

4. Locate the four wing nuts. 

5. Rotate lower then upper wing nuts counterclockwise to loosen. 

6. Grasp the assembly sides and pull the assembly toward you. 

Figure 6-6 Blower Motor Assembly Removal Sequence

To replace the blower motor assembly:

1. Ensure a good connection exists between the blower motor assembly and the ECM. 

2. Check for proper connector alignment. 

3. Use steady, gentle pressure to replace the blower motor assembly. Do not force the blower 
assembly into position. If resistance is encountered, check for bent pins. 

4. Tighten the upper and lower wing nuts in a clockwise direction. 

6.8.1 Separating the Bezel and Blower Motor Assembly

Use the following procedure to separate the blower motor assembly from the bezel (refer to 
Figure 6-7): 

1. Place the assembly grill-side down. 

2. Locate and disconnect the +24 V blower motor connector (red and black leads). 

3. Locate the Phillips-head screws; loosen and remove. 

4. Separate the bezel and blower motor assembly. 

Reverse this procedure to reconnect the bezel and blower motor assembly. Return the assembly to 
the chassis. 

Figure 6-7 Bezel and Blower Motor Assembly Separation




6.9 Removing the Electronic Control Module 

Ensure proper grounding before beginning this procedure. To remove the electronic control module 
(ECM) (refer to Figure 6-8): 

1. Remove power from the drive. 

2. Remove the OCP (refer to Section 6.7). 

3. Remove the blower motor assembly (refer to Section 6.8). 

4. Remove the ribbon cable from the preamp control module (PCM). 

5. Locate the lock/release lever on the side of the ECM. 

6. Grasp the ECM handle and apply pressure to the lock/release lever with your thumb. 

NOTE 

Do not use extreme force when applying pressure to the lock/release lever. Only firm,
steady pressure is required to remove the ECM.

Figure 6-8 ECM Removal

7. Pull the ECM toward the front of the chassis.

8. If resistance is encountered, apply a small amount of back pressure to the ECM and, at the
same time, apply pressure to the lock/release lever. Pull the ECM toward the front of the
chassis.

Reverse this procedure to replace the ECM. Apply firm (not excessive) pressure until the carrier 
latch engages its detent. Reconnect the ECM-to-PCM ribbon cable. 

NOTE 

Do not force the ECM. If necessary, remove and examine rear connector pins to verify 
nothing is bent or jammed. In very extreme cases, it may be necessary to remove the SDI 
cables from the rear of the drive before inserting the ECM. 

6.10 Removing the Preamp Control Module

It is not necessary to remove the HDA in order to remove the preamp control module (PCM). 

Refer to Figure 6-9 while performing this procedure. Ensure proper grounding before beginning 
PCM removal. 

Figure 6-9 PCM Removal

1. Remove power from the drive.

2. Remove the OCP (refer to Section 6.7). 

3. Remove the blower motor assembly (refer to Section 6.8). 

4. Remove the ribbon cable from the PCM. 

5. Remove the Phillips-head screws securing the PCM to the HDA. 

6. Note the orientation of the PCM-to-HDA connector. Place your fingers on the sides and near 
the PCM-to-HDA connector. Use steady, firm pressure to dismount the PCM from the HDA. 

Reverse this procedure to replace the PCM. Ensure proper alignment between the HDA and 
PCM-to-HDA connectors. (Check for bent pins prior to reinstalling.) 

6.11 Removing/Replacing the Head Disk Assembly

This section documents the procedures for removing and replacing the HDA. Use extreme care 
during HDA removal/replacement procedures to prevent damage to the HDA. 

As with all static-sensitive components, ensure proper grounding when handling. Place components 
on a grounded, anti-static work surface. Prior to installation, a replacement HDA must be 
thermally stabilized. 

WARNING 

The thermal stabilization procedure is mandatory. Failure to thermally stabilize this
equipment could cause premature equipment failure.

6.11.1 Removing the HDA 

Run tests T43 and T44 before replacing the HDA, to capture seek and spinup information. Record 
this information on the red tag when returning the HDA. 


Run tests T53 and T54 to clear stored parameters from the old HDA. 

WARNING 

An HDA weighs 15 kilograms (33 pounds). Use both hands during this procedure. 

The positioner/head assembly must never be rotated in a counterclockwise direction. 
Damage to the media and heads could occur. 

Place the HDA on a grounded, anti-static work surface after it has been removed. Use proper 
grounding techniques when working with drive components. 

To remove the HDA (refer to Figure 6-10): 

1. Remove power from the drive. 

2. Remove the OCP (refer to Section 6.7). 

3. Remove the blower motor assembly (refer to Section 6.8). 

4. Remove the ribbon cable from the HDA. 

5. Locate the baseplate latch assembly. 

6. To unlock the HDA from the drive chassis, grasp the baseplate latch assembly and pull up and
turn until the lock is in its top position. 

7. Grasp the HDA carrier handle and pull the HDA toward the front of the drive. 

8. Place one hand under the HDA as you remove it from the drive chassis. 

Figure 6-10 HDA Removal

9. If resistance is encountered, attempt to carefully reinsert the HDA and try this procedure again. 
It may be necessary to apply a small amount of back pressure before the HDA can be removed 
from the chassis. 

10. Place the HDA on a grounded, anti-static work surface. 

6.11.2 HDA Thermal Stabilization Procedure 

The replacement HDA must be thermally stabilized before its moisture barrier bag is opened. 

Prior to installation, a replacement HDA must be stored at a temperature of 16°C (60°F) or higher 
for a minimum of 24 hours. The HDA may be stored in the computer room or in another storage 
room under controlled temperature conditions. If stored in another storage room, the HDA must sit 
for an additional hour in the computer room in which it will be installed. 

CAUTION 

Under no circumstances should the HDA be left overnight in an uncontrolled 
temperature environment where cold temperatures could occur (for example, in a car) 
and then opened/installed without a 24-hour thermal stabilization period. 
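
The stabilization rules above reduce to a simple check that can be worked through before opening the moisture barrier bag. The sketch below is illustrative only; the function and parameter names are assumptions made for the example, and it encodes just the figures given in this section (16°C, 24 hours, and one additional hour in the computer room).

```python
MIN_STORAGE_TEMP_C = 16.0        # 16 degrees C (60 degrees F) or higher
MIN_STORAGE_HOURS = 24.0         # minimum storage time at that temperature
EXTRA_COMPUTER_ROOM_HOURS = 1.0  # extra time if stored outside the computer room


def hda_ready_to_open(min_temp_c, hours_stored, stored_in_computer_room,
                      hours_in_computer_room=0.0):
    """Return True when the replacement HDA meets the stabilization rules.

    min_temp_c is the lowest temperature seen during the storage period.
    """
    if min_temp_c < MIN_STORAGE_TEMP_C or hours_stored < MIN_STORAGE_HOURS:
        return False
    if stored_in_computer_room:
        return True
    return hours_in_computer_room >= EXTRA_COMPUTER_ROOM_HOURS
```

An HDA left overnight in a car fails the temperature test no matter how long it was stored, so the 24-hour period must be started again in a controlled environment.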

6.11.3 Replacing the HDA

After the thermal stabilization criteria have been met, open the HDA box and carefully cut the
heat-sealed end of the moisture barrier bag. Remove the desiccant from the moisture barrier bag 
and the HDA from the foam bag. Save all HDA packing material to repackage the failing HDA. 

Use the following procedure to install the replacement HDA: 

• Slide the HDA into the chassis until the spring-loaded latch locks into place. 

WARNING 

When reinserting the HDA into the drive chassis, take care not to pinch your fingers.
There is limited clearance between the HDA handle and chassis edges.

• Turn the baseplate latch assembly until the latch drops into place and the HDA is secure. To 
ensure the HDA is secure, try sliding the drive in and out of the chassis. 

• Reconnect the ECM-to-PCM ribbon cable. 

• Run tests T53 and T54 to clear stored (replaced) HDA-related information. 

Return the defective HDA in the replacement HDA's shipping package. Place desiccant inside the 
moisture barrier bag before folding and sealing the package. Tape the red tag to the outside of the 
sealed HDA package. 

6.11.4 Separating the HDA and Carrier 

A number of repairs require separating the HDA and carrier. Use the following procedure to 
accomplish this: 

1. Remove the HDA from the chassis and set it carrier-side up on a grounded, anti-static work 
surface (Section 6.11.1). 

2. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11. 

NOTE 

Remove the C clips by pressing against the spring-loaded rear HDA connector and, 
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to 
loosen and remove the clips. 

3. Remove the rear HDA connector. 

4. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper 
bracket assembly. 

5. Completely loosen but do not remove the four Torx-head screws with a Torx T-15 screwdriver.
Refer to Figure 6-11 for the location of the Torx-head screws.

Reverse this procedure to reassemble the HDA and the carrier. 

Figure 6-11 HDA Carrier Separation

6.11.5 Removing the Spindle Ground Brush

This section documents the procedure for removing and replacing the RA90/RA92 spindle ground 
brush. Because handling the HDA is necessary, extreme caution must be used. 

Refer to Figure 6-12 during this procedure. 

Figure 6-12 Spindle Ground Brush Removal

1. Remove power from the drive.

2. Remove the OCP (refer to Section 6.7). 

3. Remove the blower motor assembly (refer to Section 6.8). 

4. Disconnect the ribbon cable from the PCM. 

5. Remove the HDA from the chassis (Section 6.11.1) and set it on a grounded, anti-static work 
surface, carrier-side up. 

6. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11.

NOTE

Remove the C clips by pressing against the spring-loaded rear HDA connector and,
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to
loosen and remove the clips.

7. Remove the rear HDA connector. 

8. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper 
bracket assembly (refer to Figure 6-11). 

9. Loosen the four Torx-head screws with a Torx T-15 screwdriver. Refer to Figure 6-11 for the 
location of the Torx-head screws. 

10. Separate the HDA and carrier (refer to Section 6.11.4). 

11. Locate and remove the spindle ground brush cover shown in Figure 6-12.

12. Locate and remove the spindle ground brush by removing the two hex-head screws that hold it 
in place. 

Replace the ground brush, then reassemble the HDA and drive assemblies.

6.11.6 Removing the Brake Assembly 

This section documents the procedures for removing and replacing the RA90/RA92 brake assembly. 
Because handling the HDA is necessary, extreme caution must be used. 

You will need a contact extraction tool (Digital Part Number 29-26655-00) to perform this
procedure. Refer to Figures 6-12, 6-13, and 6-14 while performing this procedure. 

CAUTION 

Never rotate the actuator or positioner shaft counterclockwise. HDA damage could
occur.

1. Remove power from the drive. 

2. Remove the OCP (refer to Section 6.7). 

3. Remove the blower motor assembly (refer to Section 6.8). 

4. Disconnect the ribbon cable from the PCM. 

5. Remove the HDA from the chassis (Section 6.11.1) and set it on a grounded, anti-static work 
surface, carrier side up. 

6. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11. 

NOTE 

Remove the C clips by pressing against the spring-loaded rear HDA connector and, 
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to 
loosen and remove the clips. 



Figure 6-13 Contact Extraction Tool (callouts: rear HDA connector, handle, contact cavity, lance release tip)

7. Remove the rear HDA connector. 

8. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper 
bracket assembly (refer to Figure 6-11). 

9. Loosen the four Torx-head screws with a Torx T-15 screwdriver. Refer to Figure 6-11 for the
location of the Torx-head screws.

10. Separate the HDA and carrier (refer to Section 6.11.4). 

11. Locate and trace the brake electrical contacts to the rear HDA connector. 

12. Extract the brake electrical contacts (contacts 4 and 5) from the rear HDA connector using the 
contact extraction tool from the kit. 

13. Align the contact extraction tool with the front of the connector. Align the lance release tip 
with the lance release slot, making sure to align the tip with the contact cavity. Refer to 
Figure 6-13. 

Figure 6-14 RA90/RA92 Brake Assembly Removal/Replacement

14. Push the lance release tip in until the locking lance (metal tip inside contact pin) is released 
from the slot. 

15. Hold the connector firm and push the handle of the contact extraction tool forward. The contact 
should back out of the rear of the connector.

16. Remove the contact extraction tool and pull the brake contact from the back of the connector. 

17. Locate and remove the spindle ground brush cover (refer to Figure 6-12). 

18. Locate and remove the spindle ground brush. 

19. Use a 5/32 Allen wrench to remove brake hold-down screws (refer to Figure 6-14). 

20. Note the hex shape of the spindle and matching hex shape of the brake hub. 

21. Orient the brake hub to the spindle and fit them together. Do not rotate spindle
counterclockwise. 

22. Secure the brake to the baseplate with the brake hold-down screws. Refer to Figure 6-14. 

23. Replace the spindle ground brush. 

24. Reinstall the spindle ground brush cover. 

25. Insert brake electrical contacts into slots 5 and 6 in the connector. (Ensure a secure fit by 
tugging on leads.) 

26. Reassemble the HDA to the HDA carrier. 

27. Attach the rear HDA connector and C clips. 

28. Reassemble the drive. 

29. Install the HDA into the drive chassis. 

6.11.7 Spindle Lock Solenoid Failure 

This section covers solenoid failures. The solenoid is not a replaceable FRU; however, its failure 
prevents the heads from loading and data from being recovered. 

To preclude the loss of data because of a solenoid failure, this procedure allows you to bypass the
solenoid long enough to recover the data and back it up onto another disk drive or tape unit. 

CAUTION 

Attempt this procedure only under the worst possible situations; that is, if customer 
backup data is not current or work in progress must be recovered. After performing this 
procedure and recovering the data, replace the HDA according to Section 6.11.3. 

Refer to Figure 6-15 while performing this procedure. 

1. Remove power from the drive. 

2. Remove the OCP (refer to Section 6.7). 

3. Remove the blower motor assembly (refer to Section 6.8). 

4. Remove the HDA from the chassis (Section 6.11.1) and set it on a grounded, anti-static work 
surface, carrier side up. 

5. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11. 

NOTE 

Remove the C clips by pressing against the spring-loaded rear HDA connector and, 
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to 
loosen and remove the clips. 

6. Remove the rear HDA connector. 

7. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper 
bracket assembly. 

8. Loosen the four Torx-head screws with a Torx T-15 screwdriver. Refer to Figure 6-11 for the
location of the Torx-head screws. 

9. Separate the HDA and carrier (refer to Section 6.11.4). 

10. Locate the solenoid (refer to Figure 6-15). 

Figure 6-15 Disabling the Solenoid for In-Field Data Recovery

11. Disconnect the electrical leads from the solenoid and place electrical tape over the lead contacts
to prevent shorting. 

12. Loosen and remove the positioner lock solenoid hold-down screws with a T-15 Torx wrench. 

13. Remove the solenoid and set it aside. 

14. Reinstall the solenoid hold-down screws to the baseplate and tighten slightly. 

15. Loop a piece of 20-gauge wire (or equivalent) approximately 6 inches long through the solenoid 
armature as shown in Figure 6-15. 

16. Secure one end of the wire around one of the solenoid hold-down screws and tighten the screw 
securely onto the wire. 

17. After looping the wire through the solenoid armature, gently pull the solenoid plunger away 
from the positioner/actuator assembly until it stops (approximately a quarter inch). 

18. Loop the loose end of wire around the second hold-down screw and tighten the screw securely 
onto the looped wire. 

CAUTION 

Ensure both sides of the wire are secure and that the solenoid plunger is held back. 
The aim of this procedure is to recover customer data. If the solenoid plunger slips 
back, it will cause the solenoid armature to allow the positioner/actuator assembly to 
lock. Data recovery will then be unsuccessful. 

Reassemble the HDA, carrier, and drive. 

After data has been recovered, replace the HDA according to the HDA replacement procedure in 
Section 6.11.3. When returning the old HDA from the field, also return the failed solenoid. 

6.12 Removing the Power Supply

This section documents the procedures for removing and replacing the RA90/RA92 power supply. 

Ensure you have removed power from the correct drive. Proceed with caution whenever working 
with high voltages. Refer to Figure 6-16 while performing this procedure. 

WARNING 

When removing and replacing drive components, take care not to pinch your fingers.
There is limited clearance between the HDA handle and chassis edges.

1. Spin down the drive. 

2. Turn off the drive circuit breaker to remove power from the drive. 

3. Note port cable connector locations when removing the power supply. 

4. Remove the power cord from the rear of the drive. 

5. Remove other cables that may interfere with the power supply removal. 

6. Loosen the bottom two quarter-turn fasteners by turning in a counterclockwise direction. 

7. Support the bottom of the power supply with one hand. 

8. Loosen the top two quarter-turn fasteners by turning in a counterclockwise direction. 

9. Remove the power supply. 

Figure 6-16 Power Supply Removal 

CAUTION 

The power supply weighs approximately 6.8 kilograms (15 pounds). It must be supported 
when being removed from the drive. 

Reverse this process to replace the power supply. Check the line voltage selector switch to ensure 
you have the correct voltage for your area. 

6.13 Removing/Replacing the Rear Flex Cable Assembly 

This section documents the procedures for removing and replacing the RA90 and RA92 rear flex 
cable assembly. To facilitate the removal of the rear flex cable assembly, first remove the drive 
HDA, power supply, and ECM. After these drive components have been removed, remove the drive 
chassis from the cabinet and place it on a grounded, anti-static work surface. 
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To remove the rear flex cable assembly (refer to Figure 6-17): 

1. Loosen the four Allen screws holding the rear panel assembly to the drive chassis. 

2. Remove the 15 contact springs. Set the contact springs aside. 

3. Remove the four Allen screws and set the rear panel assembly aside. (Set aside the drive serial 
number label bracket.) 
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Figure 6-17 Rear Flex Cable Assembly Removal 

The next step requires the removal of the rear flex cable assembly. There are a number of adhesive- 
backed cable clamps used to secure the rear flex cable assembly in place. The cable clamps all 
open toward the rear of the drive with one exception; locate and remove this "one" cable clamp to 
facilitate removal of the rear flex cable assembly. (See Figure 6-17 for the location of this clamp.) 

4. Remove the two Allen-head screws that secure the green male rear connector to its bracket. 

5. Remove the two C clips that secure the black ECM female connector to its bracket. 

6. Remove the rear flex cable assembly. 
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The next step requires the replacement of the rear flex cable assembly. Lay the replacement rear 
flex cable assembly out next to the one being replaced. Set the dip switches on the new rear flex 
cable assembly to the exact settings from the replaced one. By hand, bend the rear flex cable 
assembly 90 degrees in the same places as the original assembly. 

NOTE 

Future flex cable assemblies may use dip shunt switch packs rather than dip switch 
packs. An open shunt is equivalent to an open (off) switch. 

7. Place the rear flex cable assembly on the rear panel assembly with the two connectors on their 
proper brackets. 

8. Secure the green male rear connector to its bracket with the two (previously removed) Allen- 
head screws. 

9. Secure the black female ECM connector to its bracket with the two (previously removed) C clips. 

10. Replace the previously removed adhesive-backed cable clamp. 

11. Loosely attach the rear panel assembly to the rear of the drive chassis. 

12. Replace the 15 contact springs. 

13. Secure the rear panel assembly by tightening the Allen screws. 

14. Return the drive chassis to the cabinet. 

15. Return the drive components to the drive chassis. 

6.14 Media Removal Service for Customers 

The on-site media removal and disposal service is an exclusive Digital Customer Services offering. 

The following tools are needed to remove drive media from the HDA. Digital part numbers for these 
tools are listed in Table 6-1: 

1. 1/16 Allen wrench 

2. 3/32 Allen wrench 

3. 5/32 Allen wrench 

4. 3/16 Allen wrench 

5. Torx size T-15 wrench 

6. Torx size T-15 socket wrench 

7. Pliers 

8. Diagonal cut pliers 

9. Needlenose pliers 

10. Medium Phillips screwdriver 

11. Flat-bladed screwdriver 
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Table 6-1 Digital Part Numbers for Recommended Tools 



Technical Description                     Part Number 

Ballpoint hex screwdriver blade, 1/16"    29-26111-00 
Ballpoint hex screwdriver blade, 3/32"    29-26113-00 
Ballpoint hex screwdriver blade, 5/32"    29-26117-00 
Ballpoint hex screwdriver blade, 3/16"    29-26118-00 
Pliers, diagonal cutters, 4"              29-19328-00 
Pliers, long needlenose                   29-13461-00 
Socket, Torx T-15                         29-27275-01 
Screwdriver, Torx T-15                    29-22772-00 
Screwdriver blade, Phillips #1            29-11001-00 
Screwdriver blade, slotted, 3/16"         29-10988-00 
Screwdriver blade, Torx T-15              29-22772-00 
Screwdriver blade, Torx T-10              29-26947-01 

To remove the media from the HDA (refer to Figures 6-18 and 6-19): 

1. Remove the PCM from the HDA and store it in an ESD bag for return to Customer Services 
Logistics. Use proper ESD procedures. 

2. Remove the four Torx head screws, or three Torx head screws and one medium Phillips-head 
screw that secure the PCM plug to the HDA chassis. 

3. Remove the HDA from the drive chassis (refer to Section 6.11.1). 

4. Separate the HDA and carrier (refer to Section 6.11.4). 

5. Use a Phillips screwdriver to remove the actuator counterweight located at the end of the 
positioner shaft. 

6. Use a 3/8-inch open-end wrench or a pair of medium-sized needlenose pliers to hold the 3/8-inch 
nut on the positioner motor assembly located near the center of the shaft. This is a locking 
nut for an expander bolt holding the positioner coil assembly to the positioner shaft. Hold 
the nut and, at the same time, loosen the 3/32 Allen screw with a 3/32 Allen wrench. Turn 
counterclockwise until the 3/32 Allen screw, the 3/8-inch nut, and expander bolt assembly can 
be removed. 

7. Use a medium-sized Phillips screwdriver to remove the three retaining screws holding the 
positioner motor assembly to the HDA baseplate. 

8. Cut the flex leads from the positioner motor to the HDA electrical socket with diagonal cutters. 

9. Firmly grasp the positioner motor assembly at the end of the positioner shaft and lift up. If you 
have difficulty sliding the positioner motor assembly off the end of the positioner shaft: 

• Loosen the four crash stop Allen screws using a 5/32 and 1/16 Allen wrench. Turn screws in 
a counterclockwise direction. 

• Reattempt to remove the positioner motor assembly from the positioner shaft. 
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Figure 6-18 HDA Media Removal — Top View 
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Figure 6-19 HDA Media Removal — Bottom View 

10. Use a flat-bladed screwdriver to detach the spring clip that secures the positioner lock pin into 
the positioner collar. 

11. Remove the solenoid armature that connects the lock pin to the solenoid from the lock pin. 
Using a pair of needlenose pliers, remove the lock pin from the positioner shaft. 

12. Use a Torx T-15 wrench to remove the three screws used to secure the positioner/head assembly 
to the baseplate. 

13. Use a Torx T-15 wrench to remove the two (in some cases three) screws that secure the internal 
airfoil. All Torx screws should now be removed from the bottom of the baseplate. 

14. Turn the HDA over to access the top cover Torx-head screws. 

15. Use a Torx T-15 wrench to remove the 13 top cover Torx-head screws (refer to 
Figure 6-18). 

16. Remove the top cover of the HDA. 

17. Remove the internal air filter assembly from the HDA. 

18. Remove the HDA filter fence from the HDA assembly. It may be necessary to rotate the 
positioner/head assembly so the heads are toward the inner guardband area of the media. 

19. Push the loose PCM plug out of the chassis and maneuver the PCM plug and its attached cable 
assembly so the positioner/head assembly can be removed from the chassis. 

20. Rotate the positioner out of the way as you manually unload the heads from the media. 

21. Lift the entire positioner/head assembly out of the HDA chassis. 

22. Use a Torx T-15 internal socket wrench to remove the six (6) male Torx-head screws securing 
the top clamp ring on the media stack, and lift clamp rings, media, and spacer rings from the 
spindle hub. 

23. Give the media to the customer. 

24. Collect all loose pieces of hardware and remove from the site. Return hardware to Customer 
Services Logistics for proper disposal. 
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7.1 Introduction 

This chapter describes the procedure for updating RA90/RA92 disk drive microcode when a new 
version of the microcode is released. 

7.2 Microcode Update Cartridge Description 

The microcode update cartridge is a ROM assembly that contains updated microcode for the 
RA90/RA92 disk drive microprocessor. Figure 7-1 shows the microcode update cartridge. 

To update the RA90/RA92 microcode, insert the cartridge in the microcode update port and run 
T40. 
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Figure 7-1 Microcode Update Cartridge 

7.3 Microcode Update Port Description 

The microcode update port is a cutout in the operator control panel (OCP). It is located below and 
to the left of the Run switch. Figure 7-2 shows the location of the RA90/RA92 microcode update 
port. 

To access the microcode update port, it is necessary to remove the cabinet front access panel. 

Figure 7-2 Microcode Update Port 

7.4 Running Test 40 (T40) 

T40 is a microcode subroutine used to load the new microcode from the microcode update cartridge 
into the master processor. The new microcode may be intended as a servo microcode update, a 
diagnostic update, or a functional microcode update. 

During update, the new microcode is downloaded to its destination EEPROM in three separate 
passes. Each pass takes approximately 20 seconds. The pass count is displayed in the OCP 
alphanumeric display during the update procedure. 

Pass one reads the cartridge, calculates and verifies the checksum in the cartridge, and verifies the 
microcode consistency codes. If pass one fails, the update is aborted and an appropriate error code 
is generated. 

Pass two writes the even pages in EEPROM (each page is 16 bytes). An even page is defined as 
BIT04 of the EEPROM address equal to zero. 

Pass three writes the odd pages in EEPROM. An odd page is defined as BIT04 of the EEPROM 
address equal to one. 
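
The even/odd page rule can be illustrated with a short Python sketch. It relies only on what the text states: pages are 16 bytes, so BIT04 of the byte address selects an even or odd page. The function names and the `eeprom_size` parameter are hypothetical and are not part of the drive microcode; the actual EEPROM size is not given here.

```python
from typing import List

PAGE_SIZE = 16  # bytes per EEPROM page, as described above


def is_odd_page(address: int) -> bool:
    """Return True when BIT04 of the EEPROM address is one (odd page)."""
    return bool((address >> 4) & 1)


def pages_for_pass(pass_number: int, eeprom_size: int) -> List[int]:
    """List page start addresses written in pass two (even) or pass three (odd).

    eeprom_size is an illustrative parameter, not a documented drive value.
    """
    if pass_number not in (2, 3):
        raise ValueError("only passes two and three write EEPROM pages")
    want_odd = pass_number == 3
    return [addr for addr in range(0, eeprom_size, PAGE_SIZE)
            if is_odd_page(addr) == want_odd]
```

For a hypothetical 64-byte region, pass two would cover the pages starting at addresses 0 and 32, and pass three the pages starting at 16 and 48.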

After the microcode is fully loaded (indicated by [C 40]), the drive performs a reset and goes 
through its normal power-up sequence of internal diagnostics. The OCP performs a reset, returns 
the drive to its normal operating state, and displays the unit address. 

7.5 Updating the Microcode 

Remove the cabinet front access panel before beginning the microcode update procedure. Refer to 
Section 6.6 for the front access panel removal procedure. 

Use the following procedure when updating drive microcode: 

1. Load the microcode update cartridge in the microcode update port. 

2. Load test T40 (drive must be spun down). 

3. Start test T40. 

The following occurs in the OCP display (where S = start, P = pass, C = completed): 

1. [S 40] (2 seconds). 

2. [P 1] (20 seconds) Pass one checks PROM to be loaded. 

3. [P 2] (20 seconds) Pass two writes the new code into the even pages in EEPROM. 

4. [P 3] (20 seconds) Pass three writes the new code into the odd pages in EEPROM. 

5. [C 40] (1 second) Update is complete. 

6. [WAIT] (10 seconds) Exits test mode and goes through power-up hardcore sequence. 

7. [0000] Returns to display the drive unit address. 
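
As a rough check on how long an update should take, the display sequence above can be tabulated and summed. This is only an arithmetic sketch using the approximate durations listed; the names `T40_SEQUENCE` and `elapsed_before` are hypothetical.

```python
# (OCP display text, approximate duration in seconds) from the list above
T40_SEQUENCE = [
    ("S 40", 2),
    ("P 1", 20),
    ("P 2", 20),
    ("P 3", 20),
    ("C 40", 1),
    ("WAIT", 10),
]


def elapsed_before(display: str) -> int:
    """Approximate seconds elapsed before the given display text appears."""
    total = 0
    for text, seconds in T40_SEQUENCE:
        if text == display:
            return total
        total += seconds
    raise KeyError(display)


# Approximate time from starting T40 until the unit address is displayed again
total_seconds = sum(seconds for _, seconds in T40_SEQUENCE)
```

With these figures the whole sequence takes roughly 73 seconds, and the completion display appears about 62 seconds after the test starts. A display that stays on one pass much longer than that suggests consulting the error codes in the next section.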

Remove the microcode update cartridge from the OCP and replace the cabinet front access panel. 
Select the appropriate port switches to return the drive to the available state. 

7.5.1 Error Codes/Common Problems During Microcode Update 

The most common problems encountered during a microcode update are listed by error code in 
Table 7-1. 
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Table 7-1 Common Error Codes/Problems During Microcode Update 



Error Code   Reason                              Solution 

BD           The microcode cartridge was not     Reseat the microcode update cartridge. 
             detected. 

BC           The cartridge checksum was          Reseat the cartridge and retry the update. If 
             incorrect.                          it still fails, either replace the OCP or try 
                                                 the cartridge in another drive. Acquire a new 
                                                 microcode cartridge if necessary. 

BE           Cartridge and EPROM consistency     Reseat the cartridge and try again. If the same 
             check failed.                       error occurs, replace the cartridge with one 
                                                 containing compatible code. 

FD           An EEPROM checksum error            Attempt to reload the cartridge code. If the failure 
             occurred.                           occurs again, electronic control module (ECM) 
                                                 replacement may be necessary. 
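
For service notes or a site checklist, the table can be restated as a small lookup. The sketch below is illustrative only: the codes and the abbreviated texts come from Table 7-1, while the names `UPDATE_ERRORS` and `suggest_action` are hypothetical and do not correspond to any Digital tool.

```python
# Table 7-1 restated as a lookup: code -> (reason, first corrective action)
UPDATE_ERRORS = {
    "BD": ("Microcode cartridge not detected",
           "Reseat the microcode update cartridge."),
    "BC": ("Cartridge checksum incorrect",
           "Reseat the cartridge and retry the update."),
    "BE": ("Cartridge and EPROM consistency check failed",
           "Reseat the cartridge and try again."),
    "FD": ("EEPROM checksum error",
           "Attempt to reload the cartridge code."),
}


def suggest_action(code: str) -> str:
    """Return the reason and first corrective action for an OCP error code."""
    key = code.strip().upper()
    if key not in UPDATE_ERRORS:
        return f"{key}: not listed in Table 7-1"
    reason, action = UPDATE_ERRORS[key]
    return f"{key}: {reason}. {action}"
```

Only the first action for each code is encoded; the follow-up steps (trying the cartridge in another drive, replacing the OCP or ECM) remain as described in the table.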
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Capturing Information for LARS and CHAMPS 



This appendix contains sample LARS for installation and general troubleshooting of field 
replaceable units (FRUs) in the field. 
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RA90/RA92 Error Recovery Levels 



RA90 and RA92 disk drives incorporate hardware error recovery as part of the RA90/RA92 circuitry. 
Read data circuitry is altered any time the controller issues error recovery commands. 

Generally, error recovery is used to assist the controller during unrecoverable or uncorrectable 
errors. The intent is to enhance the controller/disk interaction to recover data that might otherwise 
be lost. 

The RA90/RA92 hardware recovery circuitry is divided into six functional areas, as shown in 
Table B-1. 



Table B-1 RA90/RA92 Hardware Error Recovery Circuits 

Circuit                 Description 

READ THRESHOLD GAIN     There are two ways to increase the chances of reading data from a potentially 
                        bad spot on a disk: increase read threshold or decrease read threshold. The 
                        drive determines whether information coming off the disk is either too weak 
                        or too strong and consequently increases or decreases the read circuitry 
                        amplitude in an attempt to recover information. 

HOLD-OVER ONE-SHOT      VCO control voltage is held stable to prevent large phase errors during a 
                        momentary loss of read pulses from the disk. 

SKEW READ GATE          A delay of one or two byte times is introduced between the moment the SDI 
                        gate array (on the I/O-R/W module) receives the READ GATE signal from the 
                        SDI controller and the time the I/O-R/W module acts upon the READ GATE 
                        signal. The amount of delay (skew) changes for each revolution of the disk 
                        when the index pulse is received. The skew time is one byte time for odd 
                        revolutions of the disk and two byte times for even revolutions of the disk. 

FAST LOCK DELAY         Fast lock delay is accomplished by the R/W ENDEC chip. The drive software 
                        enables fast lock delay through Misc. I/O Port (bit <4>) with a 2.24- 
                        microsecond delay in addition to the delayed gate signal. 

OFFSET OF HEADS         Positive and negative offsets can be applied to the servo circuitry during 
                        attempted reads. Eight combinations of offsets are utilized in the RA90. These 
                        include plus or minus offsets of 5%, 10%, 12.4%, or 20% of the track width. 

WRITE DIAGNOSTICS       Thin-film heads can sometimes take on the characteristics of the magnetic 
                        media. The buildup of this magnetic field in the heads interferes with the 
                        drive's ability to read the surface of the disk. Running write current through 
                        the heads usually breaks up the magnetic alignment of the thin-film heads' 
                        substrata layers. This level of error recovery writes internal diagnostics 
                        within the dedicated inner guardband to eliminate this problem. With normal 
                        drive operations, this should rarely be a problem. 
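
The alternating skew described for the SKEW READ GATE circuit can be pictured with a minimal Python sketch. It assumes revolutions are counted from 1 at each index pulse, which the text does not specify; the function name is hypothetical.

```python
def read_gate_skew_byte_times(revolution: int) -> int:
    """Skew applied to READ GATE, in byte times, for a given revolution.

    Assumption for illustration: revolutions are numbered starting at 1.
    Odd revolutions use one byte time, even revolutions use two.
    """
    if revolution < 1:
        raise ValueError("revolution count starts at 1 in this sketch")
    return 1 if revolution % 2 == 1 else 2


# Skew over four consecutive revolutions
skews = [read_gate_skew_byte_times(r) for r in range(1, 5)]
```

Over successive revolutions the skew simply alternates between one and two byte times, so a marginal sector is presented to the read circuitry with two different gate timings.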

The RA90/RA92 error recovery circuits are activated when the SDI controller issues an SDI ERROR 
RECOVERY command to the drive. This occurs after the controller has exhausted its read retry 
count (five for the RA90/RA92). An error recovery level is specified by the controller in the SDI 
ERROR RECOVERY command. The level number specifies which combination of error recovery 
circuits the drive is to employ. There is no controller intervention in the actual drive error recovery 
process. 

RA90 and RA92 disk drives employ 14 levels of error recovery, as shown in Table B-2. 

Table B-2 RA90/RA92 Error Recovery Levels 

Level  Description 

14     Offset of heads by dedicated servo to +5% (offset is towards outer guardband). 
13     Offset of heads by dedicated servo to -5% (offset is towards inner guardband). 
12     Offset of heads by dedicated servo to +10%. 
11     Offset of heads by dedicated servo to -10%. 
10     Offset of heads by dedicated servo to +12.4%. 
9      Offset of heads by dedicated servo to -12.4%. 
8      Offset of heads by dedicated servo to +20%. 
7      Offset of heads by dedicated servo to -20%. 
6      Enable hold-over one shot. 
5      Fast lock delay level. 
4      Turn on low threshold. 
3      Turn on high threshold. 
2      Turn on read gate delay. 
1      Diagnostic writes (to clear head domain cluttering). 

NOP: This is the normal default state of the drive. No error recovery circuits are activated. 
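
When reading error logs, it can help to have Table B-2 in a form that maps a level number to the circuit involved. The following Python sketch restates the table; the dictionary layout and the names `RECOVERY_LEVELS` and `head_offset_percent` are illustrative assumptions, and positive offsets follow the table's convention of pointing toward the outer guardband.

```python
from typing import Optional

# level -> (action, head offset in percent of track width, or None)
RECOVERY_LEVELS = {
    14: ("offset heads", +5.0),
    13: ("offset heads", -5.0),
    12: ("offset heads", +10.0),
    11: ("offset heads", -10.0),
    10: ("offset heads", +12.4),
    9: ("offset heads", -12.4),
    8: ("offset heads", +20.0),
    7: ("offset heads", -20.0),
    6: ("enable hold-over one shot", None),
    5: ("fast lock delay", None),
    4: ("turn on low threshold", None),
    3: ("turn on high threshold", None),
    2: ("turn on read gate delay", None),
    1: ("diagnostic writes", None),
}


def head_offset_percent(level: int) -> Optional[float]:
    """Return the head offset for a level, or None if the level uses no offset."""
    return RECOVERY_LEVELS[level][1]


# Levels that reposition the heads rather than alter the read circuitry
offset_levels = sorted(lvl for lvl, (_, off) in RECOVERY_LEVELS.items()
                       if off is not None)
```

Levels 7 through 14 are the head-offset levels; levels 1 through 6 act on the read/write circuitry instead.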

The drive supplies the controller with the number of error recovery levels it has at its command. 
This is done by the drive in response to a GET COMMON CHARACTERISTICS command from the 
controller. The actual mechanism is transparent to the user, but works as follows: 

During a read data operation, the controller reads a block of data from the disk. If there are no 
ECC errors, data is passed to the host operating system. However, if the controller detects an ECC 
error, it compares the number of ECC symbols in error to the drive's ECC error symbol threshold. 
The RA90/RA92 disk drive has an error symbol threshold of six. 

As long as the error symbol threshold has not been reached, the controller can correct the data. If 
the error symbol threshold is equaled or exceeded, the drive then sends an error to the host error 
log and sets the BBR (bad block replacement) flag. The BBR process is actually implemented at a 
later time. 

The controller then determines if it can correct the data. If the data is uncorrectable, the controller 
examines the drive's common characteristics to determine the drive's read retry count parameter. 
The RA90/RA92 disk drive has a read retry count of five. 

If, after exhausting the read retry count on a block of data, the data is still uncorrectable, the 
controller determines if the drive has error recovery capabilities. The RA90/RA92 disk drive has 14 
error recovery levels (see Table B-2). The controller issues an ERROR RECOVERY command to the 
drive. The drive then initiates the first level of error recovery. In the case of the RA90/RA92, level 
14 is used first and the drive decrements down to zero. The RA90/RA92 activates the appropriate 
hardware circuits corresponding to a level 14 error recovery. The controller repeats the entire read 
data block process including, if necessary, the read retry process. 

If the data has still not been recovered, the controller issues another ERROR RECOVERY 
command, this time specifying level 13. Again, the drive error recovery process starts and continues 
until the data has been recovered or all the error recovery levels have been tried. If the read retry 
operation fails and the error recovery levels fail, the controller returns an error to the host and 
BBR is implemented on that block of data. 

The error recovery mechanism is not restricted to ECC errors encountered during reads. Header- 
related errors may also cause the hardware error recovery levels to be implemented. 
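
The controller/drive interaction described in this appendix can be summarized as a loop. The Python sketch below is a simplified model, not controller firmware: `try_read` is a hypothetical stand-in for one read attempt at a given recovery level, and the sketch assumes each level gets one initial read plus the full read retry count.

```python
from typing import Callable, Optional

READ_RETRY_COUNT = 5      # RA90/RA92 read retry count
HIGHEST_LEVEL = 14        # number of RA90/RA92 error recovery levels


def recover_block(try_read: Callable[[int], bool]) -> Optional[int]:
    """Model the read, retry, and error recovery sequence for one block.

    try_read(level) returns True when the block is read without an
    uncorrectable error. Level 0 means no recovery circuits are active.
    Returns the level at which the read succeeded, or None when every
    level fails (the case in which BBR is implemented on the block).
    """
    # Normal read first (level 0), then level 14 down to level 1
    levels = [0] + list(range(HIGHEST_LEVEL, 0, -1))
    for level in levels:
        # Assumption: one initial read plus READ_RETRY_COUNT retries per level
        for _attempt in range(1 + READ_RETRY_COUNT):
            if try_read(level):
                return level
    return None
```

For example, a simulated block that is readable only with a -10% head offset would be recovered at level 11, after the normal read and levels 14, 13, and 12 have each exhausted their retries.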
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Customer Equipment Maintenance 



This appendix will assist customers in maintaining their equipment to ensure the highest level of 
equipment performance and reliability. Specifically, this appendix addresses the maintenance of 
60-inch storage array cabinet systems. 

C.1 Customer Responsibilities 

The customer is directly responsible for: 

* Supplying accessories, including storage racks, cabinetry, tables and chairs, as required. 

* Making the appropriate documentation available in a location convenient to the system. 

* Obtaining cleaning supplies specified in this appendix. 

* Performing the specific equipment maintenance described in this appendix. 

C.1.1 Cleaning Supplies 

To properly maintain the equipment, the customer must acquire the following items and supplies: 

* Vacuum cleaner with flexible hose and nonmetallic, soft-bristle brush attachment 

* Isopropyl alcohol (at least 91%) (Digital P/N 29-19665) 

* Lint-free tissues or cloths 

* All-purpose spray cleaner 

CAUTION 

When using spray cleaner, do not spray cleaner directly into computer equipment. 
This could adversely affect equipment reliability or damage electrical components. 

C.1.2 Ongoing Equipment Care 

The following should be performed on an ongoing basis: 

* Keep the immediate area in front of the storage array cabinets free of obstructions. 

* Keep the exterior of the cabinets and the surrounding area clean. Use a lint-free cloth and 
isopropyl alcohol to remove sticky residue left on painted surfaces by customer cabinet number 
labels, and so forth. 

* Maintain the site temperature/humidity to comply with Digital's recommended environmental 
range (reference product-specific documentation). This will ensure the highest product 
reliability and product life goals are achieved. 

C.1.3 Monthly Equipment Maintenance 

The following tasks should be performed on a monthly basis, or more often if the environment warrants: 

CAUTION 

Avoid touching the operator control panel switches during cleaning operations. The 
state of the drives could change and affect the operation of the subsystem. 

• Vacuum and/or wipe top of storage array cabinet with a lint-free cloth. 

• With a soft-bristle brush attachment, vacuum the air vent grill on the front door of the storage 
array cabinet. Leave the front door assembly attached to the storage array cabinet while 
vacuuming. 

C.1.4 Maintenance Records 

Digital suggests the customer keep an accurate log of all equipment maintenance. A maintenance 
log form for 60-inch storage array cabinets is included in this appendix for customer use. This 
form may be reproduced and inserted in the customer's site management guide for record-keeping 
purposes. Refer to Figure C-1. 



DIGITAL INTERNAL USE ONLY 




CUSTOMER EQUIPMENT MAINTENANCE LOG 
FOR STORAGE ARRAY CABINETS 
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Figure C-1 Customer Equipment Maintenance Log for Storage Array Cabinets 
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Customer Services' Preventative Maintenance 



The information contained in this appendix will assist Digital Customer Services engineers in 
performing and planning preventative maintenance (PM) procedures for RA90/RA92 disk drive 
products. 

D.1 PM Checklist for RA90/RA92 Disk Drives 

The following preventative maintenance steps should be performed by Digital Customer Services on 
a scheduled basis at specified intervals. The PM checklist is a per storage element checklist. 

Due to the frequency of this activity, we suggest that you record this activity on the RA90/RA92 
Preventative Maintenance Activity Log provided in this section. This log sheet may be reproduced 
and inserted in the site management guide, as appropriate. 

One-Year Interval 

Perform the following PM steps at 1-year intervals: 

1. Utilize VAXsimPLUS to obtain the repair history of each disk drive. Examine the drive error 
profile over various lengths of time to determine whether a proactive repair may be warranted. 
Examination may include opening up the time window for the last week, last month, and last 3 
months. Deeper examination of error logs may be necessary if there are any error rate trends 
of concern. (Time: 10:00 minutes for basic error analysis with VAXsimPLUS) 

2. Remove the drive(s) from service. (Time: 2:00 minutes) 

3. Remove the cabinet front access panel or bezel assembly. Remove and clean each cabinet 
pre-filter or air vent grill as necessary. (Time: 5:00 minutes) 

4. Determine the drive microcode revision levels by examining subsystem printouts or running 
drive test T45. Update microcode to the latest compatible functional revision as necessary. 
(Time: 3:00 minutes) 

5. From the rear of the cabinet at the I/O bulkhead panel, verify the SDI cables are dressed and 
routed in an orderly fashion to prevent the cables from being tripped over or stepped on. 

6. Verify the SDI connectors are securely attached to the I/O bulkhead panel. 

7. Return the drive(s) to service. 

The yearly PM steps can be accomplished in approximately 20 minutes per drive. Servicing more 
than one drive at a time will result in reduced time per drive. 

Two-Year Interval 

Perform the following PM steps at 2-year intervals: 

1. Remove the drive(s) from service. (Time: 2:00 minutes) 

2. Remove drive power. 

3. Remove the OCP and blower bezel assembly. Visually inspect the drive chassis interior for 
debris. If considerable dirt/lint is present, remove the electronic control module (ECM) assembly 
and head disk assembly (HDA) then vacuum the chassis. Reassemble the drive. (Time: 10:00 
minutes) 

4. Power up the drive and determine whether the blower motor quickly attains its speed and the 
drive becomes ready. (Time: 2:00 minutes) 

5. Execute drive internal test T00 for one pass. (Time: 10:00 minutes) 

6. Return the drive(s) to service. 

The 2-year interval PM steps can be accomplished in approximately 24 minutes per drive. Servicing 
more than one drive at a time will result in reduced time per drive. 
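
The per-drive totals quoted for the 1-year and 2-year intervals follow from the individual step times. The short Python sketch below adds them up for planning purposes; only steps with a stated time are included, and the dictionary names and step labels are shorthand introduced here.

```python
# Stated step times in minutes, taken from the checklists above
ONE_YEAR_STEPS = {
    "error analysis with VAXsimPLUS": 10,
    "remove drive from service": 2,
    "clean pre-filter or air vent grill": 5,
    "check or update microcode": 3,
}

TWO_YEAR_STEPS = {
    "remove drive from service": 2,
    "inspect and vacuum chassis": 10,
    "power-up and blower check": 2,
    "drive internal test, one pass": 10,
}

one_year_total = sum(ONE_YEAR_STEPS.values())
two_year_total = sum(TWO_YEAR_STEPS.values())


def visit_minutes(drive_count: int, per_drive_minutes: int) -> int:
    """Upper-bound estimate for a visit; servicing several drives at once
    is expected to take less time per drive than this."""
    return drive_count * per_drive_minutes
```

The sums come to 20 and 24 minutes, matching the approximate figures given for each interval. Multiplying by the number of drives gives a conservative estimate for a site visit.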

Five-Year Interval (for the HDA) 

In addition to the 1- and 2-year interval PM steps previously described, perform the following step 
at 5-year intervals: 

1. Remove and replace the spindle ground brush using procedures contained in this manual. 

The 5-year interval PM steps should be accomplished within 40 minutes per drive. 
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RA90/RA92 PREVENTATIVE MAINTENANCE ACTIVITY LOG 
FOR EACH RA90/RA92 STORAGE ELEMENT 



DRIVE TYPE (circle one) RA90 / RA92 
DRIVE SERIAL NUMBER 
CABINET 
CABINET S/N 

ECM 
HDA 
DATE OF SERVICE 
MAINTENANCE ACTIVITY 
MICROCODE REV LEVEL 
REV 
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Acceptance testing, 2-13 

drive spun down, 2-18 

drive spun up, 2-19 
Add-on installation, 2-22 
Address selection, 2-20 


BBR algorithms, 5-48 

BBR packet, 5-48 

Blower/bezel motor assembly removal, 

6-7 
Blower motor, dual outlet, 3-12 
Brake assembly removal, 6-17 



Cluster installation note, 2-20 
Controller byte, 5-5 
Correctable ECC errors, 5-48 
Cylinder address bytes, 5-6 



Data rates, 1-6 
Data storage capacity 

RA90 disk drive, 1-1 

RA92 disk drive, 1-1 
Deskidding cabinets, 2-5 
Diagnostics 

host-level, 5-37 

HSC-based, 5-37 

KDM-based, 5-37 

off-line, 5-37 

power-up, 2-16 

standalone, 5-37 

XDA controller-based, 5-38 
Diagnostics and utilities 

Average Seek Timing test (T38), 4-14 

Clear DD Bit utility (T55), 4-21 

Clear Seeks utility (T53), 4-21 

Clear Spinups utility (T54), 4-21 



Diagnostics and utilities (cont'd.) 
Display Drive Serial Number utility 

(T47), 4-20 
Display Error Log Errors utility (T41), 

4-16 
Display Seeks utility (T43), 4-17 
Display Spinups utility (T44), 4-18 
Display Time utility (T24), 4-17 
Drive Revision Level utility (T45), 

4-18 
Drive S/N Bus test (T04), 4-7 
Drive-Sensed Temperature Display 

utility (T29), 4-12 
Error Log Checkpoint utility (T50), 

4-21 
Gray Code (Track Counter) test (T29), 

4-12 
Guardband test (T30), 4-12 
Hardcore Sequence Test (T18), 4-11 
HDA Revision utility (T46), 4-20 
Head Select and One Seek test 

sequence (T24), 4-12 
Head Select test (T06), 4-8 
Head Select utility (T63), 4-22 
Head Switch Timing test (T39), 4-15 
idle loop tests (spun down), 4-2 
idle loop tests (spun up), 4-2 
Incremental Seek test (T31), 4-13 
individual descriptions, 4-5 
Loop-Off utility (T62), 4-22 
Loop-On-Error utility (T61), 4-22 
Loop-On-Test utility (T60), 4-21 
Master CPU test, 4-5 
Master RAM test, 4-5 
Master ROM test (T01), 4-6 
Master Timer test (T02), 4-6 
Minimum Seek Timing test (T36), 

4-14 
One Seek utility (T64), 4-23 
power-up, 4-1 
problem OCP displays, 4-3 
Random Seek test (T33), 4-13 
Read/Write Force Fault test (T16), 

4-11 
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Diagnostics and utilities (cont'd.) 

Read/Write Sequence test (T19), 4-11 
Read-Only Cylinder Formatter test 

(T17), 4-11 
Read-Only test (T14), 4-9 
SDI Loopback test (external) (T09), 

4-9 
SDI Loopback test (internal) (T08), 

4-9 
Sector/Byte Counter test (T07), 4-8 
Seek Parameter Input utility (T65), 

4-23 
sequence tests, 4-2 
Serial Communications Interface test, 

4-6 
Servo Data Bus Loopback test (T03), 

4-6 
Servo RAM test, 4-6 
Servo Spinup Sequence test (T20), 

4-11 
Tapered Seek test (T34), 4-13 
Toggle Seek test (T32), 4-13 
Total Drive Sequence test (spinning) 

(T22), 4-12 
Total Drive Sequence test (spun down) 

(T23), 4-12 
Total Servo Sequence test (T21), 4-11 
Update Cartridge utility (spun down) 

(T40), 4-15,7-5 
Variable Average Seek Timing test 

(T66), 4-26 
Write/Read test (T15), 4-9 
Documentation 
related, xiii 
troubleshooting, 5-1 
Drive unit address 

alternate display mode, 3-21 
programming, 3-19 



ECM 

description, 3-3 

I/O-R/W module, 3-3 

module types, compatibility, 3-3 

removal, 6-10 

servo module, 3-5 
Electrical specifications, 1-7 
Electronic control module 

See ECM 
Electrostatic protection 

See ESD protection 
Environmental limits, 1-7 
Error byte, 5-4 
Error code byte, 5-9 
Error codes 

during acceptance testing, 2-20 



Error codes (cont'd.) 

OCP, 2-18 
Error descriptions 

A0 Unable to Clear SDI Array Safety 

Status Register, 5-93 
A1 Unable to Force Encoder Error, 

5-93 
A2 Unable to Force Multiple Head 

Select While Reading, 5-93 
A3 Unable to Force Write Gate and 

Write Unsafe, 5-94 
A4 Unable to Force Write Current and 

No Write Gate, 5-94 
A5 Unable to Force Write Gate and No 

Write Current, 5-94 
A6 Unable to Force Read Gate and Off 

Track Error, 5-94 
A7 Unable to Force Write Gate and Off 

Track Error, 5-94 
A8 Unable to Force Read and Write 

Fault While Writing, 5-95 
A9 Servo Fault/Force Fault Test, 5-95 
AB Forced Read and Write Fault While 

Reading, 5-95 
4A Drive Disabled by Controller (DD 

Bit Set), 5-75 
AD UART Overrun or Framing Error, 

5-95 
5A Embedded Head Gain Calibration, 

5-78 
7A Embedded Offset/Gain Calibration 

Timeout, 5-84 
AE OCP Data Packet Checksum Error, 

5-95 
AF OCP Start Byte is Not a Sync 

Character, 5-96 
9A Positioner Corrected Event During 

Data Transfer, 5-91 
0A SDI Incorrect Command Opcode 

Parity Error, 5-56 
1A SDI Invalid Cylinder Address, 

5-63 
2A SDI Invalid Subunit Specified, 

5-68 
8A Servo Processor Inside of 

Destination Track During Settle 

State, 5-88 
33 Attempt to Write Through Bursts, 

5-70 
6A Unable to Force No-Sync Error, 

5-81 
3A Write Gate and Write-Protected, 

5-72 
B0 OCP Invalid Response, 5-96
B2 OCP Retransmit Failure, 5-96 
B3 OCP Command Unsuccessful, 5-96 
B4 OCP Command Timeout, 5-97 
Error descriptions (cont'd.)

B6 Master Processor UART Loopback 

Test Failure, 5-97 
B8 Master Processor UART

Transmitter/Receiver Error, 5-97 
B9 OCP-to-Master Processor 

Communications Timeout Failure, 

5-97 
BA OCP NMI Timeout Failure, 5-97 
5B Bias Calibration Error, 5-78 
BB OCP Processor ROM Checksum 

Failure, 5-98 
BC Cartridge Checksum Failure, 5-98 
BD Microcode Update Cartridge 

Detection Failure, 5-98 
BE Cartridge/EEPROM/Master 

Processor Consistency Check, 

5-98 
BF Error Log Write Compare Error, 

5-99 
8B Gray Code Error After Settling With

Fine Track, 5-88 
3B Hard INIT Occurred to Drive, 5-72
4B Index Error, 5-75 
1B Inner Guardband Error, 5-64
7B Invalid Test While Spindle Running, 

5-84 
6B R/W Write/Read Test Overall 

Failure (Three or More Bad 

Heads), 5-81 
2B SDI Invalid Diagnose Memory 

Region Location, 5-68 
0B SDI Invalid Opcode, 5-56
9B Write and Positioner Corrected 

Event, 5-91 
C0 Hardware Revision and Microcode

Incompatibility, 5-99 
C1 Outer Guardband Detected After

HEAD LOAD Command, 5-100 
C2 Inner Guardband Detected After 

HEAD LOAD Command, 5-100 
C3 Seek to Outer Guardband Failed, 

5-100 
C4 Seek to Outer Guardband Not 

Detected, 5-101 
C5 HDA and ECM Incompatibility, 

5-101 
C6 PLO Failure, 5-101 
C7 Seek to Inner Guardband Failed, 

5-101 
C8 Inner Guardband Not Detected 

After Seek to Inner Guardband, 

5-102 
C9 Analog Loop Test Failure, 5-102 
CA Media Not Spinning, 5-102 
98 Can't Execute Diagnostic/Jumper, 

5-91 



Error descriptions (cont'd.) 

64 Cannot Clear IE) Error Bits, 5-80 
67 Cannot Execute Write Test (Read- 
only Test Failed or Not Run First), 
5-80 

CC Servo Processor Recalibrate Failed, 

5-102 
CD Track Counter (Gray Code), 5-103 
CE EEPROM Write Cycle Timeout, 

5-103 
CF Invalid Data in EEPROM, 5-103 
7C Gray Code Match Error After 

Settling, 5-85 
5C Incorrect Diagnostic Index or Sector 

Pulse, 5-78 
1C Outer Guardband Error, 5-64 
6C R/W Write/Read Test Partial Failure 

(One or Two Bad Heads), 5-81 
9C Read Gate and Positioner Corrected 

Event, 5-92 
0C SDI Command Length Error 

(LVL2), 5-57 
4C SDI Invalid Write Memory Region 

Error, 5-75 
2C SDI Spindle Not Ready with 

Seek/Recalibration Command, 

5-68 
8C Uncalibrated and PLO Error, 5-88 
58 Dedicated Head Gain Calibration 

Error, 5-77 
79 Dedicated Servo Calibration 

Timeout Error, 5-84 
7D Embedded Interrupt Timeout, 

5-85 
9D Error Log Header Corrupted, 5-92 
3D HDA Read/Write Interlock Broken, 

5-72 

65 Diagnostic Index or Sector Not 

Detected, 5-80 
61 Diagnostic Index Sync Timeout 

Error, 5-79 
1D Illegal Servo Fault, 5-64
8D Polarity Error on Velocity Command 

During a Multi-Track Seek, 5-88 
2D Power Supply Over-Temperature, 

5-69 
42 Drive Not On Line/SEEK Command

Issued, 5-73 
0D SDI Invalid Command with Drive 

Error, 5-57 
55 DSP Sanity Timeout After Load, 

5-77 
6D Unable to Force Read Gate and 

Write Gate Together, 5-82 
4D Write Gate and Bad Embedded 

Servo Information, 5-76 
E0 Spindle Rotation Not Detected, 

5-103 
Error descriptions (cont'd.)

E1 Spindle Speed Out Of Range,

5-104 
E2 A/D or D/A Converter Insane, 

5-104 
E3 Excessive Positioner Current 

During Test, 5-104 
E4 Open Circuit Detected During 

Power Amp Toggle Test, 5-104 
E5 Overcurrent Detected During 

Actuator Test, 5-105 
E6 Track Counter Clear Failure, 

5-105 
E7 Illegal Zone Detected, 5-105 
E8 Outer Guardband Timeout, 5-106 
E9 Gray Code Timeout During the 

Turnaround State, 5-106 
EA Gray Code Timeout During Outer 

Guardband State, 5-107 
EB Sector Pulse Timeout During Sync- 
Up State, 5-107 
EC Servo Fault and PLO Fault Bit Set 

in GASP, 5-107 
9E Drive Faulted, Test Cannot Run, 

5-93 
ED Servo Watchdog Timeout, 5-107 
EE Servo Digital Signal Processor 

Reset, 5-107 
EF Head Unload Failed, 5-108 
7E Fine Track Lost After Settling, 

5-85 
22 Electronic Control Module Over- 
Temperature Error, 5-66 
8E Master Processor ROM/EEPROM 

Consistency Code Mismatch, 5-89 
59 Embedded Servo Offset Calibration 

Error, 5-78 
34 ENDEC Encoder Error, 5-70 
3E OCP Interlock Broken, 5-73 
1E Power-Up After AC Power Loss,

5-64 
0E SDI Lvl 1 Invalid Select Group 

Number, 5-57 
2E SDI Spinup Inhibited by Controller 

Flags, 5-69 
6E Unable to Force Write Gate and 

Write Protect Error, 5-82 
F0 Servo Microcode Update Failed, 

5-108 
F1 Command to Servo Processor Timed

Out, 5-108 
F3 Servo Spinup Failed, 5-108 
F4 Servo Spindown Failed, 5-109 
F5 Seek Failed, 5-109 
F6 Head Switch Failed, 5-109 
F7 RTZ Failed, 5-109
F8 Head Load Failed, 5-109 



Error descriptions (cont'd.) 

F9 Diagnostic Command Failed, 5-110 
FA Servo Processor Failed Seek to DGN 

Write Cylinder, 5-110 
FB Servo Processor Failed Seek to 

DGN Read Cylinder, 5-110 
FD EEPROM Checksum Error, 5-111 
6F Diagnostic Write Attempted While 

Write-Protected, 5-82 
8F EEPROM Checksum Failure, 5-89 
9F Error Log Check Point Code, 5-93 
4F Invalid Select Group (Level 1 

Command) - Not Read/Write 

Ready, 5-76 
44 Format Command and Format Not 

Enabled, 5-74 
2F SDI RUN Command with Run 

Switch in Stop Position, 5-69 
0F SDI Write Enable on a Write-Protected Drive, 5-57
1F Sector Overrun Error, 5-65
7F Servo Settling Timer Expired, 5-85 
77 Head Load Timeout Error, 5-84 

14 Head Offset Margin Event, 5-62 

15 Head Offset Out-of-Band Error, 

5-62 
54 Head Select Register Loopback 

Error, 5-77 
93 Inner Guardband/Servo Fault: No 

Interrupt Detected, 5-89 
92 Inner Guardband Without a Servo 

Fault Set, 5-89 
49 Invalid Command During 

TOPOLOGY Command, 5-75 

47 Invalid Disconnect Command/TT Bit 

Error, 5-74 

05 Invalid Drive Serial Number Code, 

5-55 
46 Invalid Hardware Fault, 5-74 

48 Invalid Write Memory Byte 

Counter/Offset Error, 5-75 
24 Loss of Fine Track During Data 

Transfer, 5-66 
88 Master Processor EEPROM Write 

Violation Error, 5-87 
85 Master Processor RAM Test Failure, 

5-87 
87 Master Processor ROM Checksum 

Failure, 5-87 
80 Master Processor ROM Consistency 

Code Mismatch, 5-86 
57 Master Processor Timer Failure, 

5-77 
11 Microcode Cartridge Load Occurred, 

5-58 

06 Microcode Fault, 5-55 
Error descriptions (cont'd.)

91 No Interrupt Detected During R/W 

Force Fault, 5-89 
74 Offset Timeout Error, 5-83
60 Read/Write Head Select Failure, 

5-79 
38 Read Gate and Multiple Head Chips 

Selected, 5-71 
45 Read Gate and Off Track Both 

Asserted, 5-74 

31 Read Gate and Write Gate Both 

Asserted, 5-70 

32 Read or Write While Faulted, 5-70 

62 Read Test Overall Read Failure 

(Three or More Bad Heads), 5-79 

63 Read Test Partial Failure (One or 

Two Bad Heads), 5-79 
66 Read Test Servo Failure, 5-80 
71 Recalibrate Timeout Error, 5-82 
10 SDI Command Length Error (LVL2), 

5-58 
96 SDI Failure: Port B, 5-90 

07 SDI Frame Sequence Error, 5-55 
29 SDI Invalid Error Recovery Level 

Specified, 5-68 

19 SDI Invalid Format Request, 5-63 

16 SDI Invalid Group Select LVL2, 

5-62 
40 SDI Invalid Read Memory Region 
Error, 5-73 

94 SDI Loopback Test Failure on Both 

Ports, 5-90 
09 SDI Lvl 1 Framing Error, 5-56 

08 SDI Lvl 2 Checksum Error, 5-56 

17 SDI Port A Command/Response 

Timeout, 5-63

18 SDI Port B Command/Response 

Timeout, 5-63 

20 SDI RTCS Parity Error, 5-65 

95 SDI Test Failure: Port A, 5-90 

21 SDI Transfer (Pulse) Error, 5-65 
51 Sector/Byte Counter Error, 5-76 
89 Seek Speed Out of Range, 5-87 
50 Servo Data Bus Failure, 5-76 
25 Servo Fault Error, 5-66 

27 Servo Over-Temperature Error at 

51, 5-67 

28 Servo Over-Temperature Error at 

52, 5-67 

78 Servo Processor Bias Force 
Calibration Timeout, 5-84

82 Servo Processor Coarse Velocity 

State Timeout, 5-86 

83 Servo Processor Fine Velocity State 

Timeout, 5-86 
73 Servo Processor Head Switch 
Timeout, 5-83 



Error descriptions (cont'd.) 

53 Servo Processor Offset Error, 5-77 
76 Servo Processor Sanity Timeout, 

5-83 
84 Servo Processor Seek Direction 

Error, 5-87 
72 Servo Processor Seek Timeout, 

5-83 
81 Servo Processor Settle State 

Timeout, 5-86 
70 Servo Processor Spinup Timeout, 

5-82 
75 Servo Processor Unload Timeout, 

5-83 
56 Servo RAM Test Failure (High Byte 

of Address), 5-77 
52 Servo RAM Test Failure (Low Byte 

of Address), 5-76 
13 Spindle Motor Control Fault, 5-59 
01 Spindle Motor Transducer Timeout, 

5-54 

01 O Spindle Motor Transducer 

Timeout #, 5-53 

03 Spindle Not Accelerating During 

Spinup, 5-54 
26 Spindle Speed Error (Servo 

Processor), 5-67 
12 Spindle Speed Unsafe Error, 5-58 

04 Spinup Too Long to Lock on Speed, 

5-54 

02 Spinup Too Slow, 5-54 
86 Static RAM Failure, 5-87 

43 TCR and Not Read/Write Ready 
Fault, 5-74 

68 This Diagnostic Cannot Execute 

Without Software Jumper, 5-81 

69 Unable to Force Compare Error, 

5-81 
90 Unable to Force Index Error, 5-89 

36 Write and Servo Uncalibrated, 

5-71 
35 Write and Write Unsafe, 5-71 
30 Write Current and No Write Gate, 

5-69 

37 Write Gate and No Write Current, 

5-71 

39 Write Gate and Off Track, 5-72 
Error logs, 1-4 
Error recovery level byte, 5-9 
Error recovery levels, B-1
Error recovery Levels 

NOP: no operation, B-2
Errors related to media 

See media errors 
ESD protection, 1-8 

wrist strap use, 1-8 
Fault display mode setup, 3-16

Floor loading, 2-3 

Front access panel, removal, 2-7 



H 



HDA 

brake assembly removal, 6-17 

carrier separation, 6-14 

description, 3-10 

hardware compatibility, 3-12 

installation, 6-14 

removal, 6-12 

spindle ground brush removal, 6-16 
HDA preventative maintenance, D-2 
HDA revision bits byte, 5-6 
Host error logs, 5-2 



I 



I/O-R/W module 

description, 3-3 

hardware revision matrix, 3-5 
Idle loop testing, 2-16 
Input current (amps), 1-7 
Inrush current, 1-6 
Installation note, cluster, 2-20 



Labeling, OCP, 2-13 
Lamp test, OCP, 2-16 
LARS examples, A-1
Latency, 1-6 
Level A Retry, 5-49 
Level B Retry, 5-49 
Leveling cabinets, 2-6 
Logical media layout, 1-3 

M 

Maintenance activity log, C-3, D-2 
Maintenance strategy, 1-3, 1-4 
Manufacturing fault code, 5-9 
Media errors, 5-32 

drive or controller port not defined 

(random R/W errors), 5-35 
excessive number of blocks replaced 
because of R/W path problems, 
5-33 
isolating random R/W transfer errors, 

5-35 
LBN correlated to a physical cylinder, 

5-34 
LBN correlation to a single group 
(head), 5-33 



Media errors (cont'd.) 

LBN correlation to multiple groups 
(heads), 5-34 

LBNs correlated to zone write 
boundaries, 5-34 

multiple controllers report same errors, 
5-35 

repeating LBNs/RBNs, 5-33 

single controller port affected, 5-35 
Media removal service, 6-25 
Microcode 

compatibility with drive FRUs, 3-13 
Microcode update procedure, 7-3 

microcode update cartridge description, 
7-1 

running T40, 7-3 

update port description, 7-2 
Mode byte, 5-4 
MSCP status/event 

6B, 5-49 
MSLG$_LEVEL, 5-46 
MSLG$_RETRY, 5-46

N 

Normal mode setup, 3-15 



OCP 

functions, 3-14 

removal, 6-6 
OCP error codes, 2-18 
OCP labeling, international, 2-13 
OCP lamp test, 2-16 
Online 

placing drive on line, 2-20 
Operating temperature and humidity, 

2-3 
Operator Control Panel 

See OCP 



Part numbers, ECM components, 3-3 
Parts removal sequence, 6-3 
PCM 

description, 3-7 

removal, 6-11 

switch pack settings, 3-9 
Phase requirements, 2-1 
Physical characteristics, 1-6 
Physical media layout, 1-3 
Positioner errors, 5-49 
Power, applying to drive, 2-14 
Power and safety precautions, 2-1 
Power cord connections, 2-11 
Power dissipation, 1-7
Power supply 

available voltages, 3-12 

removal, 6-22 
Power supply location, drive, 2-12 
Power-up 

resident diagnostics, 2-16 
Preamp control module 

See PCM 
Preventative maintenance 

customer responsibilities, C-1

Customer Services' responsibilities, 
D-1

maintenance activity log, D-2
Previous command opcode byte, 5-6 
Programming the unit address, 2-20 



R 



Rear access panel, removal, 2-9 
Rear flex cable removal, 6-23 
Removal/replacement procedures 
bezel and blower motor assembly 

separation, 6-9 
blower/bezel motor assembly removal, 

6-7 
brake assembly removal, 6-17 
contact extraction tool, 6-20 
ECM removal, 6-10 
front access panel removal, 6-4 
FRUs, sequence for removal, 6-3 
HDA and carrier separation, 6-14
HDA installation, 6-14 
HDA removal, 6-12 
media removal service, 6-25 
OCP removal, 6-6 
PCM removal, 6-11 
power supply removal, 6-22 
rear access panel removal, 6-4 
rear flex cable removal, 6-23 
solenoid removal, 6-22 
spindle ground brush removal, 6-16 
spindle lock solenoid failure, 6-20 
tools checklist, 6-3 
Request byte, 5-3 
Response opcode byte, 5-3 
Retry count byte, 5-5 



SDI cable connections, 2-10 
Sector format, 1-1 
Seek times, 1-5, 4-5 
Sequence diagnostics, 4-2 
Service delivery strategy, 1-4 
Servo module 
description, 3-5 



Servo module (cont'd.) 

hardware revision matrix, 3-7 
Site preparation and planning, 2-1 
Software jumper, 4-4 
Specifications, RA90/RA92, 1-5 
Spindle ground brush removal, 6-16 
Spindle lock solenoid failure, 6-20 
Start/stop time, 1-6 
Status/event codes 

14, 5-48 

34, 5-46, 5-48, 5-49

54, 5-48 

74, 5-48 

94, 5-48 

2A, 5-32 

1A8, 5-48 

1AB, 5-31 

AB, 5-31 

14B, 5-29 

4B, 5-29 

10B, 5-29 

8B, 5-30 

16B, 5-31 

18B, 5-31 

2B, 5-32 

6B, 5-49 

B4, 5-48 

1C8, 5-48 

CB, 5-30 

D4, 5-48 

E8, 5-44, 5-49

1E8, 5-48 
Status bytes 

extended, 5-2 

generic, 5-4 
Subunit mask byte, 5-3



Temperature, effect on drive performance,

4-5 
Test selection from OCP, 2-16
Theory 

drive operations and theory, 3-1 
Thermal stabilization, 2-3 
Tools checklist, 6-3 
Training, 5-1 
Troubleshooting 

bad block replacement (BBR), 5-24 
controller byte, 5-5 
controller-detected communication 

events and faults, 5-30 
controller-detected drive clock dropout, 

5-31 
controller-detected drive failed 
initialization, 5-31 
Troubleshooting (cont'd.)

controller-detected drive ignored 

initialization, 5-31
controller-detected EDC errors, 5-28 
controller-detected loss of read/write 

ready, 5-30 
controller-detected lost receiver ready, 

5-30 
controller-detected protocol and 

transmission errors without 

communications errors, 5-29
controller-detected pulse or state parity 

errors, 5-29 
controller-detected receiver ready 

collision, 5-31 
controller-detected SERDES error, 

5-32 
correctable ECC errors, 5-48
cylinder address bytes, 5-6 
data collection steps, 5-26 
DBN conversion, RA90, 5-6 
DBN conversion, RA92, 5-8 
drive-detected drive errors and 

diagnostic faults (DDDE), 5-27 
drive-detected protocol errors without 

communication errors (DDPE), 

5-27 
drive-detected pulse or state parity 

errors, 5-27 
drive internal error log, 5-9, 5-27 
drive-resident utility dump (T41), 

5-14 
error byte, 5-4 
error code byte, 5-9 
error recovery level byte, 5-9 
error reporting mechanisms, 5-1, 5-15
exiting data collection/action list 

process, 5-39 
extended status bytes, 5-2 
FRU replacement stage, 5-40
general information, 5-16 
HDA revision bits byte, 5-6 
host console/user terminal trails, 5-24 
host error log, 5-25 
host error logs, 5-2, 5-23 
host-level diagnostics, 5-37 
host-level diagnostics and utilities, 

5-16 
HSC-based diagnostics, 5-37 
HSC console log, 5-24, 5-26 
HSC console utility: DKUTIL, 5-12 
identifying the problem drive, 5-23 
identifying the problem FRU, 5-24 
KDM-based diagnostics, 5-37 
LBN conversion, RA90, 5-6 
LBN conversion, RA92, 5-8 
manufacturing fault code, 5-9 



Troubleshooting (cont'd.) 

miscellaneous checks, 5-36 

mode byte, 5-4 

OCP fault indicator/error codes, 5-14, 

5-25 
off-line diagnostics, 5-37 
other means (to identify problem drive), 

5-24 
performance issues when no errors are 

being logged, 5-41 
post-verification testing, 5-40 
Power OK indicator, 5-14 
pre-verifying drive symptoms, 5-25 
previous command opcode byte, 5-6 
priority order of DSA errors, 5-27 
RBN conversion, RA90, 5-6 
RBN conversion, RA92, 5-8 
receiver ready collisions: acceptable 

rates, 5-31 
receiver ready collisions: unacceptable 

rates, 5-31 
recommended training, 5-1 
reference material, 5-1 
request byte, 5-3 

resident diagnostics limitations, 5-16 
response opcode byte, 5-3 
retry count byte, 5-5 
returning disk to customer, 5-41 
SDI drive command timeout, 5-32 
standalone diagnostics, 5-37 
status/event 6B, 5-52 
step-by-step procedure, 5-16 
subunit mask byte, 5-3 
uncorrectable ECC errors, 5-44 
unit number low byte, 5-3 
unusual problems, 5-36 
VAXsimPLUS, 5-2, 5-23, 5-25
VMS mount verification, 5-42 
worksheet, 5-23 
XBN conversion, RA90, 5-6 
XBN conversion, RA92, 5-8 
XDA controller-based diagnostics, 

5-38 



U



Uncorrectable ECC errors, 5-44 

hard, 5-44 

soft, 5-46 
Unit address 

See drive unit address
Unit number low byte, 5-3 
Unpacking, 60-inch cabinets, 2-3 
Updating microcode 

See microcode update procedure 
VAXsimPLUS, 5-2
Voltage (frequency) selection 
power supply, 2-13 



